Karen Calhoun: The Changing Nature of the Catalog and its Integration with Other Discovery Tools


I found this wonderful quote in the midst of writing the entry below. I think it’s apropos to the discussion.

Those who assume hypotheses as first principles of their speculations … may indeed form an ingenious romance, but a romance it will still be. –Roger Cotes, Preface to Sir Isaac Newton’s Principia Mathematica, Second Ed., 1713 (though I found the quote in Neal Stephenson’s Quicksilver)

Calhoun, K. (2006). The Changing Nature of the Catalog and its Integration with Other Discovery Tools. Prepared for the Library of Congress.

[Technology has] created an era of discontinuous change in research libraries—a time when the cumulated assets of the past do not guarantee future success. …  The catalog is in decline, its processes and structures are unsustainable, and change needs to be swift. … Notwithstanding widespread expansion of digitization projects, ubiquitous e-journals, and a market that seems poised to move to e-books, the role of catalog records in discovery and retrieval of the world’s library collections seems likely to continue for at least a couple of decades and probably longer (p.5).

“An era of discontinuous change” is the key phrase of the report.  Everything else hinges on this premise.  I believe Calhoun is right. In fact, I think her statement concerning ebooks is over-cautious: looking at the parallel of the music marketplace, mp3s came to dominate with a speed few predicted. (We even have the ebook equivalent of the iPod, the iPad, releasing within days.) I think that over the next couple of decades printed books will become the equivalent of vinyl records: cherished by a niche, ignored by nearly everyone else.  That scenario is far more likely than traditional catalog records maintaining their importance.

These are bold claims to make.  I feel confident stating them here, on my lonely corner of the Internet, as they mesh with everything I’ve soaked up from the blogs, magazines, and, yes, books I read.  (I could also say that the zeitgeist of the Info Age points in the same direction, but I’d never be that pretentious.)  For Calhoun, in an official report for a federal agency, the standards have to be different.  When I first read the report, I scribbled in the margin next to this introductory paragraph, “bold claims–can they be proven?”  Unfortunately, Calhoun doesn’t clear this hurdle.  The research she cites didn’t strike me as overwhelming when I first read the report, and Mann’s response pretty well shot this aspect of the Calhoun report to pieces (more on this in the next post).

Figure 1. Revitalizing the Research Library Catalog (p.11)

Figure 1 applies [Theodore Levitt’s business strategies for revitalizing products] to extending the life cycle of the research library catalog. … The quadrant on the upper right is the place where transformative, higher risk, long-term, and typically costly strategies reside (p.10).

The other three quadrants are half-measures. Only the audacious goals indicated by the top-right quadrant (develop new uses and new users for the catalog) address an environment of “discontinuous change.”  One can question the basic premise of the Calhoun Report (as Mann and Yee strenuously do), but if you accept discontinuous change as a starting point, the rational response is to match the changing environment with an equally radical change.  It comes down to how one gauges the chance that the Information Age presents an existential threat, not just to catalogs but to libraries.

In a later figure, Calhoun offers 32 “remedies”; at the top is “participate in the substitute industry” (figure 2, p.14).  This isn’t the first time I’ve had this thought, and others have written on the subject, I’m sure, but I imagine there’s an alternate universe out there where libraries aggressively pursued organizing electronic information from the early days of the internet forward.  Where companies like AltaVista, Yahoo, and Google never rose to prominence or perhaps never existed, because libraries were already there.  In other words, a world where libraries were out in front of the curve.

“Stated in business terms, the library catalog can be said to be in a declining stage of the product life cycle” (p.26).

“Google’s mission is to organize the world’s information and make it universally accessible and useful.”  Isn’t that the exact mission of libraries?  How can libraries hope to compete with Google, which throws massive amounts of talent, money, and technology at the problem, and by all appearances is doing an exceptional job?  Should libraries compete?  I’d prefer to live in a world where libraries fulfill that role (and where they are adequately funded to meet that challenge) but it’s entirely possible that that ship has sailed.  Maybe research libraries (or at least research library catalogs) should be niche products, primarily oriented towards in-depth research.  That’s not exactly what Mann argues but it is the logical conclusion from his position.  Not so fast though.  That path relies on an assumption as radical (to my eyes) as Calhoun’s: that information of value to serious researchers is almost exclusively found in books, proprietary journal articles, and other traditional library holdings, and that this state of affairs is likely to continue for the foreseeable future.  Simply put, I think this premise is wrong.

The Catalog’s Unique Advantages

… As one interviewee put it, “A user who knows how to search the catalog gets excellent results” (p.31).

Delivery of local holdings. One interviewee captured this notion by saying “the catalog has a unique benefit when it provides access to information not available via search engines, that is, information available only inside libraries” (p.32).

The first statement is the Linux argument: “computer users who know what they’re doing, who take the time to really learn how to use their computers, will find Linux far superior to Windows.”  The result?  Linux’s desktop market share is in the 1-2% range and Windows is around 85%.  Every defender of a lousy interface in history has made the same argument.

As for the second quote, I find it to be an example of the perverse logic clung to by many in the library world: it’s only true because libraries haven’t exposed their holdings.

I posted the following in my other class (Cataloging the Web) earlier this semester:

Recently, on the This Week in Tech podcast (twit.tv), I heard an interesting comparison of a difference between Apple and Google. In reference to the Apple App Store, one of the commentators said that Apple’s model was to have the equivalent of a giant room full of employees doing nothing but manually approving each addition to Apple’s offerings. In this way, Apple maintains absolute control of its users’ experiences. I can’t remember exactly what Google product was mentioned in comparison, but the important point was this: Google is *not* offering a certain feature for now because they can’t figure out how to automate the process that goes along with it. The point being, Google absolutely refuses to pursue offerings that require manual control/input because only automated processes are capable of “scaling up” to the enormous number of users Google products have to cope with. Despite the mainstream success of iPods and iPhones, Apple still has the soul of a niche company. Google, on the other hand, is entirely a creature of the Internet Age: it works, and *only* works, on a massive (and necessarily massive) scale.

That long-winded intro does lead back to Cataloging, I swear :) Doesn’t the Apple model sound a lot like a Cataloging Department? In Organizing Audiovisual and Electronic Resources for Access, Hsieh-Yee asks, “will cataloging have a role to play in the organization of information in the twenty-first century?” I would ask, “*can* it have a role?” If Cataloging is stuck in the “Apple model,” can it scale to the 24x7x365 geyser of information object creation that is the internet? Or is a fundamental change required to keep up?

Returning to the Calhoun Report, I find I made a similar note: Google refines its algorithm continuously, analyzing its unimaginably large collection of user behavior with great precision to improve its results in myriad ways (whoa, four hyperbolic adjectives in a row…I’ll try to rein in the fanboy).  Why aren’t libraries also doing this?  Libraries track checkouts on an item level, yes, and website pageviews, but if anyone’s doing large-scale, ongoing tracking of catalog users’ behavior, I haven’t heard of it.  Privacy is one of the ethical tenets of librarians, to be sure, but it’s possible to aggregate such data without compromising individual privacy.
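To make that last point a little more concrete, here is a minimal sketch of one way a catalog search log might be aggregated without keeping anything about individual patrons. This is my own illustration, not something from the Calhoun Report: the log format, the `aggregate_queries` function, and the `MIN_PATRONS` threshold are all hypothetical, and a real system would need a good deal more care than this.

```python
from collections import defaultdict

# Hypothetical log format: (patron_id, search_query) pairs from a catalog.
# MIN_PATRONS is an assumed threshold, not a standard; set it per local policy.
MIN_PATRONS = 3


def aggregate_queries(log, min_patrons=MIN_PATRONS):
    """Count distinct patrons per normalized query, dropping rare queries.

    Patron IDs are used only to count distinct searchers and are never
    returned, so the output contains no per-person data.
    """
    patrons_by_query = defaultdict(set)
    for patron_id, query in log:
        # Normalize case and whitespace so trivial variants are grouped.
        normalized = " ".join(query.lower().split())
        patrons_by_query[normalized].add(patron_id)

    # Suppress queries issued by fewer than min_patrons people, since a
    # rare query could point back to an individual.
    return {
        query: len(patrons)
        for query, patrons in patrons_by_query.items()
        if len(patrons) >= min_patrons
    }


sample_log = [
    ("p1", "Twilight"),
    ("p2", "twilight "),
    ("p3", "TWILIGHT"),
    ("p1", "twilight"),
    ("p4", "rare medical condition"),
]
print(aggregate_queries(sample_log))  # prints {'twilight': 3}
```

The idea is simply that only counts leave the function, and anything searched by too few people is dropped entirely. Whether a threshold like this is enough protection in practice is a separate question.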

It is not surprising, then, that for a number of interviewees, the question of the catalog’s integration with other discovery tools orbited around getting a Google user from Google to library collections. Several noted the importance of the interface between the library and Google. One remarked, “In the best of all possible worlds, people could search Google and library resources together [on Google].” Another noted “data about a library’s collection needs to be on Google and other popular sites as well as the library interface.” One interviewee, however, was cautious about such an approach because of the extent to which catalogs contain surrogates pointing to physical locations. This interviewee said that indexing library catalogs for Google searching “would be antisocial, because it would introduce millions of records of noise into Google. OCLC and others have experimented with exposing union catalogs on the Web … we’re still very early in learning how to do this effectively. Google can deliver instant gratification. Libraries don’t typically do that, especially with their physical holdings” (p.37).

The “catalogs contain surrogates” argument strikes me as a rationalization.  If catalogs don’t link to full text, the way most of the web does, is that an argument for preserving the status quo, or does it tell us that catalogs are falling behind the times?  I find the “antisocial” argument equally uncompelling. For starters, Google indexes well over a trillion webpages already, of all shapes, sizes, and quality, so I doubt that catalog records would “pollute” Google results in any way. Second, exposing catalog records would exert pressure on libraries to improve catalogs, and I think that’s a good thing.  If countless users start landing on their library’s Twilight record, only to find that there’s no mechanism for them to, say, download an ebook version, they’re going to complain.  Then libraries will have to figure out how they can respond (and maybe negotiate with publishers on widespread lending of ebooks).

Interviewees also suggested more interactive catalogs—letting users give feedback (such as reviews), giving users more power to control transactions (such as interlibrary loan or payments), offering RSS feeds or canned queries (such as for new books), permitting social bookmarking, and providing new output options (p.40).

I should be used to it by now, but I’m still astonished at how far most catalogs continue to lag behind the rest of (or at least the best of) the web.  User interaction? Online account management?  RSS?  Social functions?  This is basic stuff.  Are libraries ever going to stop lagging and start innovating with regard to the web?

Overall, I was pretty impressed by Calhoun’s report, and surprised that such progressive thinking came from the LOC.  Calhoun’s bold premise, that research libraries face an era of discontinuous change, requires strong evidence to back it up, and Calhoun falls short in this regard.  Furthermore, the report is over-diplomatic in presenting a range of possible responses to the serious threat the Information Age poses: the most appropriate response is radical change, and failing to adapt is the riskiest course of all.

