Everybody's Libraries

January 10, 2008

Subjects are more than just facets (and an ALA talk plug)

Filed under: architecture,libraries,subjects — John Mark Ockerbloom @ 4:55 pm

The Library of Congress’ Working Group on the Future of Bibliographic Control announced its final report today. I haven’t yet read the final version, but I read an earlier draft, and was particularly interested in what it had to say about subjects.

“How should we offer searching in library collections?” is a question that lots of libraries are asking. The answer heard a lot nowadays is “Facets!” Facets have been used in databases and e-commerce sites for some years now. Essentially, they define several (ideally independent) attributes for items, and then let users zero in on what they want by selecting and deselecting various attributes. For example, if you go to Amazon to buy shoes, you can select values from facets like brand, size, color, and price range. Try different selections, and you can quickly pick out the few pairs that best meet your needs out of the tens of thousands offered on the site. (Assuming you’re willing to buy shoes without trying them on.)
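To make the select-and-deselect idea concrete, here is a minimal sketch in Python. The records, facet names, and the helper functions select and facet_counts are all made up for illustration; this is not how Amazon or any particular catalog implements facets.

```python
from collections import Counter

# Hypothetical items; facet names and values are illustrative only.
ITEMS = [
    {"title": "Trail Runner", "brand": "Acme", "color": "red", "size": 9},
    {"title": "City Walker", "brand": "Acme", "color": "black", "size": 10},
    {"title": "Peak Hiker", "brand": "Summit", "color": "black", "size": 9},
    {"title": "Road Racer", "brand": "Summit", "color": "red", "size": 9},
]

def select(items, selections):
    """Keep only the items that match every selected facet value."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selections.items())
    ]

def facet_counts(items, facet):
    """Count how many of the given items carry each value of one facet."""
    return Counter(item[facet] for item in items)

# Selecting two facet values narrows four items down to one.
remaining = select(ITEMS, {"color": "black", "size": 9})

# Counts like these are what a faceted display shows next to each value.
brands_in_size_9 = facet_counts(select(ITEMS, {"size": 9}), "brand")
```

Deselecting a facet is just a matter of calling select again with that key left out, which is why the attributes work best when they are independent of one another.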

The Endeca catalog at NC State applies the same idea to finding books in the library. When it came out two years ago, lots of library folks got excited. And when open source tools like Solr made it easy to code up your own faceted catalog, it came as no surprise that lots of folks set out to try facet-based discovery for their collections. These new catalogs are in many ways big improvements over existing catalogs, though, as K. G. Schneider and others point out, that’s not a high bar to clear.

We too use facets in some new applications we’re building here at Penn. But they don’t entirely work well with subject headings. Kelley McGrath’s article “Facet-Based Search and Navigation: Problems and Opportunities” in the inaugural issue of the Code4lib Journal describes some of the practical problems involved.

Some have said that subject headings should change to be more facet-oriented. That was the recommendation of the Calhoun Report, commissioned by the Library of Congress and released in 2006, which called for dismantling the Library of Congress Subject Headings (LCSH), now the most commonly used subject heading vocabulary. The more recent report from the Working Group on the Future of Bibliographic Control doesn’t go that far, but it does recommend transforming LCSH, “de-coupling subject strings” and evaluating LCSH’s ability to “support faceted browsing and discovery”. The FAST system, which breaks up subjects into uncoordinated facets, is mentioned as an interesting technology to pursue.

LCSH indeed has several problems associated with it: people have a hard time finding the appropriate subject terms for what they’re looking for; catalogers have a hard time constructing terms that follow all the LCSH rules; terms are used inconsistently across collections; terms are slow to adapt to contemporary usage; and both “traditional” and faceted library catalogs have a hard time connecting related terms together using LCSH.

Should we, then, dismantle LCSH into a simple system of facet sets? Not so fast, I say. Subjects are inherently messy things, neither fully discrete nor hierarchical, and in a large collection it’s important to be able to zero in on specific subjects through relationships. Not only is there a large installed base of materials already described with LCSH, but LCSH and ontologies like it allow books to be described with greater precision, and with richer relationships, than pure facets allow. (See Thomas Mann’s “The Peloponnesian War and the Future of Reference, Cataloging, and Scholarship in Research Libraries” for a spirited argument for the power of LCSH-style subject headings.)
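Here is a small sketch of the kind of precision that can be lost. It assumes a deliberately naive decoupling that splits a heading string on its "--" separators and keeps the pieces as an unordered set; the decouple function is hypothetical, and this is not a description of how FAST or any real system processes headings.

```python
def decouple(heading):
    """Naively split a pre-coordinated heading into an unordered set of terms."""
    return frozenset(part.strip() for part in heading.split("--"))

history_of_philosophy = "Philosophy--History"
philosophy_of_history = "History--Philosophy"

# As strings, the two headings are distinct and mean different things.
strings_differ = history_of_philosophy != philosophy_of_history

# Once decoupled into unordered terms, they can no longer be told apart.
sets_match = decouple(history_of_philosophy) == decouple(philosophy_of_history)
```

In the coordinated string, the order of the parts carries the relationship between them. An unordered bag of terms would match a search on "Philosophy" plus "History" for either book, and leave the reader to sort out which is which.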

What we really need are better tools that allow readers and catalogers to take full advantage of rich subject headings and relationships, and make it easier for subject headings systems to evolve more quickly to meet the needs of users. A technology I’m experimenting with now, and calling subject maps, involves networks of related subjects, techniques for enriching those networks through automation and user input, and displays that let users and librarians browse large collections by navigating through complex subject areas. Subject maps can play well with facets and user-assigned tags, to produce discovery systems that offer the best features of all of these technologies.
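As a rough illustration of what "networks of related subjects" could mean in code, here is a sketch that stores subjects as a graph and walks outward from a starting subject. The RELATED table, its links, and the nearby function are invented for this example; it is not the implementation behind subject maps, just one simple way to model browsing by relationship.

```python
from collections import deque

# Hypothetical subject network; the subjects and links are illustrative only.
RELATED = {
    "Libraries": ["Library catalogs", "Library science"],
    "Library catalogs": ["Libraries", "Subject headings", "Online library catalogs"],
    "Subject headings": ["Library catalogs", "Cataloging"],
    "Online library catalogs": ["Library catalogs"],
    "Library science": ["Libraries", "Cataloging"],
    "Cataloging": ["Subject headings", "Library science"],
}

def nearby(start, max_hops):
    """Breadth-first walk: map each subject within max_hops to its distance."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        subject = queue.popleft()
        if distances[subject] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in RELATED.get(subject, []):
            if neighbor not in distances:
                distances[neighbor] = distances[subject] + 1
                queue.append(neighbor)
    return distances

one_hop = nearby("Libraries", 1)
two_hops = nearby("Libraries", 2)
```

A display built on something like this could show the closest subjects first and let the reader step outward, while new links contributed by automation or by users would simply be added to the table.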

Too good to be true? If you want to hear more, see a demo, or ask how this would actually work, come see and/or heckle me on Saturday at ALA. I’ll be presenting at the Catalog Form and Function Interest Group, at 10:30 AM in the Versailles Room of the Sofitel Philadelphia. For more info, and for other ALA forums that may be of interest to metadata librarians, see this post on the ALA blog.
