“What do you read, my lord?” “Works, works, works”

There’s been a lot of talk lately in the library world about the coming age of FRBR-ized library catalogs (prompted in part by development of RDA, a cataloging standard that uses FRBR). Exactly what such catalogs will look like, and whether they will actually help readers use the library more effectively, are matters of ongoing debate. One of the key differences between FRBR and older catalog models is that books and other resources that share common properties can be grouped together at various levels of abstraction.

FRBR, highly abridged

Such grouping can be helpful, for instance, for people looking for a suitable copy of Shakespeare’s play Hamlet. In a traditional library catalog, a title search for Hamlet might yield a long list of hundreds of hits, in which it is difficult to select a particular copy, or to find other appropriate search results (like William Faulkner’s book The Hamlet) among all the hits for various versions of Shakespeare’s creation. In a FRBR-ized catalog, though, the various editions (or “Manifestations”, in FRBR-speak) of Hamlet can be grouped into various “Expressions”, denoting particular texts of Hamlet, and those Expressions can be grouped into a single “Work” denoting Shakespeare’s dramatic creation.

Ideally, a general search in a FRBR-ized catalog for Hamlet would yield one hit for Shakespeare’s Hamlet as a Work, instead of hundreds of editions (unless the search terms implied a more specific edition or version). Users could then examine the Work hit to choose an appropriate copy; or they could more easily find other “Hamlet” hits if they really wanted something else, like the similarly-titled book by Faulkner. (The specific copy that the user finally obtains is known as an “Item”, and represents a level of abstraction below Manifestation. The full resource abstraction stack is sometimes referred to as “WEMI”, short for “Work, Expression, Manifestation, Item”.)
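To make the WEMI stack a little more concrete, here is a minimal sketch of it as nested Python data classes. This is only an illustration under my own assumptions: the class and field names are invented for this example (they are not taken from FRBR, RDA, or any catalog's actual schema), and the publishers, years, and locations in the sample data are placeholders rather than real editions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A single copy that a user can actually obtain."""
    location: str


@dataclass
class Manifestation:
    """A particular published edition."""
    publisher: str
    year: int
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """A particular text (or other realization) of a work."""
    description: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The abstract creation that all the expressions share."""
    title: str
    creator: str
    expressions: List[Expression] = field(default_factory=list)

    def all_manifestations(self) -> List[Manifestation]:
        """Flatten every edition under this work, across all expressions."""
        return [m for e in self.expressions for m in e.manifestations]


# Placeholder data: the publishers, years, and locations are made up.
hamlet = Work(
    title="Hamlet",
    creator="William Shakespeare",
    expressions=[
        Expression(
            "Second Quarto text",
            [Manifestation("Hypothetical Press A", 1990,
                           [Item("Main stacks"), Item("Online copy")])],
        ),
        Expression(
            "First Folio text",
            [Manifestation("Hypothetical Press B", 2001),
             Manifestation("Hypothetical Press C", 2008)],
        ),
    ],
)

# A general search would show one Work hit; the editions sit underneath it.
print(hamlet.title, "has", len(hamlet.all_manifestations()), "editions")
```

The point of the nesting is that a search interface can stop at whatever level suits the query: one Work hit for a general search, or individual Manifestations when the search terms imply a specific edition.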

In practice, there have been plenty of arguments over the details of the FRBR information model, such as where exactly the distinction is made between a Work and an Expression, or where one should model multiple Works instead of one. Ultimately, though, FRBR is intended to make it easier for users to find, identify, select, and obtain library resources. Each of these four verbs is explicitly defined as a FRBR “user task”. The information model is useful only to the extent that it supports these user tasks.

FRBR-like catalogs on the Web today

Even while libraries are debating whether or not they will develop FRBR-based catalogs, several FRBR-like catalogs have appeared on the Web, and are being used by growing audiences of online users. They have some interesting things in common. They all support some degree of representation of works at a higher level than the traditional library records. And they all honor the FRBR information model more in the breach than the observance. None of them use the FRBR WEMI model as-is, at least in the user interface, though many of their designers cite FRBR as an influence.

Typically, the “FRBR-like” catalogs now in common use show “works” and “editions”. “Works” here refer to something like FRBR Works or Expressions; the distinction between the two is not usually made in the user-facing production catalogs I’ve seen. “Editions” refer to particular FRBR Manifestations. These catalogs consolidate search results that all fall under a particular “work” (thus streamlining the FRBR “Find” task), and also offer “work”-oriented displays that include details about the “editions” encompassed by the work (thus supporting the “Identify” task). They do not give much explicit assistance towards Selecting (though most traditional library catalogs do not either). And in many cases the metadata for editions is not consistent enough to support a consistent Work grouping throughout the catalog (let alone one that is consistent with other catalogs).
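As a rough illustration of how "work" grouping might be inferred from edition records, and of why inconsistent metadata undermines it, here is a deliberately naive sketch. It is my own assumption, not the algorithm used by any of the catalogs discussed here: the function names, the record layout, and the sample records are all invented for the example.

```python
import re
from collections import defaultdict


def _norm(text):
    """Lowercase, replace punctuation with spaces, drop a leading article."""
    text = re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
    return re.sub(r"^(the|a|an) ", "", text)


def work_key(author, title):
    """A crude grouping key: normalized author plus normalized title."""
    return (_norm(author), _norm(title))


def group_into_works(editions):
    """Map each work key to the ids of the edition records that share it."""
    works = defaultdict(list)
    for ed in editions:
        works[work_key(ed["author"], ed["title"])].append(ed["id"])
    return dict(works)


# Invented sample records, not taken from any real catalog.
editions = [
    {"id": "ed1", "author": "Shakespeare, William", "title": "Hamlet"},
    {"id": "ed2", "author": "shakespeare, william", "title": "Hamlet."},
    {"id": "ed3", "author": "Shakespeare, William",
     "title": "The Tragedy of Hamlet, Prince of Denmark"},
    {"id": "ed4", "author": "Faulkner, William", "title": "The Hamlet"},
]

works = group_into_works(editions)
for key, ids in works.items():
    print(key, ids)
```

With these records, the two plainly titled Shakespeare editions fall under one key and Faulkner's novel stays separate, but the edition with the longer title ends up as its own "work". That is the kind of inconsistent grouping that shows up when the underlying edition metadata varies.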

Example 1: Open Library

As an example of how this works in practice, consider the user searching for “Hamlet” in the Open Library. Once we get past an oddly malformed first hit, the second hit is for the play “Hamlet” as a work, consolidating 665 editions beneath it. (We see also Faulkner’s The Hamlet right below that, instead of several screenfuls later.)

If we click on the hit for the play, we then go to a work page that shows information about the play in general, followed by the list of 665 associated editions. There’s not a whole lot here to help users choose a good edition, though the display does show covers or title pages when available. Also, the online editions, which are the easiest for Web users to obtain, are shown first. (The online book metadata comes from the Internet Archive, the sponsor of Open Library.)

Example 2: WorldCat.org

OCLC uses similar technologies in its WorldCat.org and WorldCat Local products. Searching for Hamlet in WorldCat.org, for instance, we get a hit for the play followed by hits for various movies and books. If you click on the title “The tragedy of Hamlet, prince of Denmark” in the first search result, you’ll just see information on the featured edition (from Yale University Press). But if you click instead on the “View all editions and formats” note at the bottom of the first search result, you’ll see that over 4000 editions are lumped under this work. You can cut down this list somewhat using facets that specify format (book, video, audio), language, and/or year of release.
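Faceted narrowing of a long edition list can be sketched in a few lines. The following is a toy illustration only: the record fields, function names, and sample values are my own assumptions, and nothing here reflects how WorldCat.org actually implements its facets.

```python
from collections import Counter


def facet_counts(editions, facet):
    """Count how many editions carry each value of one facet."""
    return Counter(ed[facet] for ed in editions)


def apply_facets(editions, **wanted):
    """Keep only the editions that match every requested facet value."""
    return [ed for ed in editions
            if all(ed.get(name) == value for name, value in wanted.items())]


# Invented sample records grouped under one "work".
editions = [
    {"id": "ed1", "format": "book", "language": "English", "year": 1992},
    {"id": "ed2", "format": "book", "language": "German", "year": 1985},
    {"id": "ed3", "format": "video", "language": "English", "year": 1996},
    {"id": "ed4", "format": "audio", "language": "English", "year": 1992},
]

print(facet_counts(editions, "format"))
english_books = apply_facets(editions, format="book", language="English")
print([ed["id"] for ed in english_books])
```

Facets like these narrow a list by format, language, or year, but they only filter on attributes already in the records; they do not by themselves say which of the remaining editions is a good choice.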

As far as I know, OCLC does not maintain an explicit work record for Hamlet; a lot of the grouping here appears to be automatically inferred from existing ‘edition’ records. The list contains a wide variety of Hamlets, including translations, audio readings and performances, and videos of stage productions and film adaptations. But it’s an incomplete list; the DVD of Kenneth Branagh’s film of Hamlet, for instance, shows up in the initial Hamlet search results, and doesn’t seem to be included as one of the DVD editions under the first “Hamlet” hit.

Example 3: LibraryThing

In LibraryThing, the work is the thing, and editions are secondary. Their work page for Hamlet combines information on the play with information on various editions, without saying very much about editions.

Since most LibraryThing users are more concerned with a work in general than with a particular edition, this sort of catchall display works reasonably well in context. Amazon has done similar consolidation in its catalog for years, which has well-known pros and cons. (You’ll see Amazon reviews of multiple editions of a work together, for instance, which is useful if you want a wide range of views but possibly misleading when a reviewer complains about the features of a specific edition.) Amazon still gives support to identifying and selecting particular editions, though, since they need to distinguish them for customers. Although LibraryThing, not driven by sales in the way that Amazon is, gives less edition-level support, they do point to 9 distinct Amazon records for Hamlet.

In a few cases, especially notable editions of Hamlet, such as this Norton Critical edition, get their own pages on LibraryThing. This is useful for folks who want to record information about an edition that’s particularly significant to them. But at present, there’s no easy way within LibraryThing to link the generic work pages with the edition-specific pages, so information gets somewhat fragmented.

Example 4: Google Books

The widely used Google Books also tries to aggregate similar books together in somewhat FRBR-like ways. Different “editions” are consolidated in search results, and related books are linked together. Unfortunately, Google’s metadata is dirty enough that the “FRBR-ization” is only partial, and confusing. Different “Editions” of a book may simply be different volumes from the same multi-volume edition. And editions of the same work may sometimes be combined together, and sometimes left separate, seemingly at random.

If you do a search for Hamlet in Google Books, each of the hits aggregates lots of volumes, but this is not immediately apparent. If you click on the “Book overview” for a hit, you’ll see some of these volumes listed under “Related books” and “Other editions”. The metadata is not consistent enough to be properly “FRBR-like”, but that seems to be what they’re aiming for.

Users not familiar with the quirks of Google Books may miss some volumes that are actually in the system. For books less popular than Hamlet, users may see a much shorter list of hits that does *not* include the particular volume or edition they are looking for, and not realize they have to look through “Other editions” under the hits shown to find the volume they want. For instance, this search for Oldcourt (as I write this) makes it appear as if Google has only two of the three volumes of this novel available in full view. In fact, they have all three volumes, but the results can mislead even experienced literary researchers.

Reflections

All of the catalogs above, then, are somewhat “FRBR-like”, but they don’t fully implement the FRBR functional or data model. I’m not sure, though, how closely they need to conform to those models. I can see room for improvement in each catalog, but they all seem to work well enough to have gained notable user communities.

How should a “FRBR-like” catalog treat Hamlet? Personally, I like the rich work-level data (both formal and informal) that I find on pages like LibraryThing. I also like the easy access to online copies provided by Open Library, the faceting and wide-angle “work” aggregation of WorldCat.org, and the scale and full-text searchability of Google Books. On the other hand, I’d like to see more consistent grouping and descriptions in each of the catalogs, and more assistance for users in selecting an appropriate edition than I currently see in any of them.

Out of hundreds of editions of Hamlet, some are particularly useful for various audiences, such as students trying to understand Elizabethan speech, actors and directors preparing to perform it, literary researchers examining and comparing source texts, and cultural historians considering famous stagings and adaptations. Why not include more data highlighting editions that are particularly useful for these and other purposes? I don’t recall such metadata being a significant part of the FRBR data model, but it seems to be in the spirit of the functional model of helping people select the right book for their needs.
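One way to picture such "useful for" data is as a set of audience tags attached to each edition record. The sketch below is purely hypothetical: the `audiences` field, the tag vocabulary, and the sample editions are my own inventions, not part of FRBR or of any catalog mentioned above.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Edition:
    label: str
    # Hypothetical "particularly useful for" tags on an edition.
    audiences: Set[str] = field(default_factory=set)


def editions_for(editions: List[Edition], audience: str) -> List[Edition]:
    """Return editions tagged for the audience; if none are, return them all."""
    tagged = [ed for ed in editions if audience in ed.audiences]
    return tagged or list(editions)


# Invented sample editions of one work.
hamlet_editions = [
    Edition("Annotated student edition", {"students"}),
    Edition("Facsimile of an early printed text", {"researchers"}),
    Edition("Acting edition with staging notes", {"actors", "directors"}),
    Edition("Plain reprint"),
]

for ed in editions_for(hamlet_editions, "students"):
    print(ed.label)
```

Falling back to the full list when no edition carries the requested tag keeps the feature from hiding anything: the tags highlight editions, they do not filter the catalog.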

On the other hand, I don’t see a great need to make a lot of differentiation between works of Hamlet and expressions of Hamlet. Insisting on lots of sharp distinctions between various high-level records in the user interface could well confuse users more than it helps them, unless there are effective ways of presenting them as a unit when appropriate.

What’s next?

As my library and others roll out the next generation of library management systems and discovery tools, determining the best use of FRBR-ish catalog data and applications may make a big difference in the effectiveness of our services to patrons. So I’m very interested in seeing how well catalogs and records designed along FRBR lines work in practice. I’m also piloting some prototypes of FRBR-like features on The Online Books Page, and I hope to have more to say about them shortly. (Update: Now I do; see my next post.)

In the meantime, I’d love to hear comments from readers on any of the catalogs I’ve mentioned, or examples of other “FRBR-like” catalogs in use that are worth looking at.

6 Responses to “What do you read, my lord?” “Works, works, works”

jrochkind says:

September 9, 2010 at 1:43 am

The only real point of a shared model like FRBR is if we’re actually going to try and use cooperative cataloging to collectively share information about groupings of manifestations (the things we traditionally catalog, more or less) into editions and works.

If we’re not really going to do that (and so far, there is no indication that anyone seriously plans to, not really even RDA), and it’s every-enterprise-for-itself using its own algorithms to try and automatically group… then just “FRBR-like in the way we managed to figure out that seemed the appropriate cost-benefit balance for our own capabilities and our understanding of our own users’ needs” is just fine. There’s no reason to standardize just for the hell of it. The reason to standardize is to share data. If there are no serious plans to do that, then there’s really no point to it.

Dorothea Salo says:

September 9, 2010 at 7:35 am

Hm. I think there’s something to be said for standardization as a way to help patrons predict interactions with software. Data-sharing isn’t the only possible reason for standards, and in this case I’m not at all sure it’s the only reason.

Which isn’t to say we’re doing a good job around FRBR and RDA, of course.

susan says:

September 9, 2010 at 10:14 pm

I like your term “FRBR-ized”!

I have often felt that there should be an option in our union catalog for a patron to place holds without specifying an edition – instead request the “work.” The reader could obtain his book more expeditiously.

The addition of RDA to the MARC record is moving in the right direction.

Karen Coyle says:

September 11, 2010 at 2:05 pm

Open Library has two other FRBR entities that it implements: authors (frbr:Person) and subjects. The latter are divided into “facets” based on LCSH: people, places, topics and times. It is not as easy to get to these as I would like, but if you click on an author name in a display, you get the page for the entity “author”.

http://openlibrary.org/authors/OL9388A/William_Shakespeare

From there you can get to the subject Denmark (under Places on the Shakespeare page):

http://openlibrary.org/subjects/place:denmark

and this gives you a publishing timeline, more subject facets, prolific authors and prolific publishers, all related to that subject. Note that both of these are generated in real time from the underlying data; they aren’t hand-created pages.

This is getting us closer to “real” FRBR, which is 3 groups of inter-related entities. Some entities, though, are very hard to manage given the structure of our current data, but I’m hoping that Open Library can tease out a few more, although there will be problems of precision (e.g. publisher names, which are uncontrolled).

John Mark Ockerbloom says:

September 15, 2010 at 2:40 pm

I’m glad to see Open Library is doing more with author and subject pages. While I’m focusing on the FRBR Group 1 entities in this particular series, there’s a fair bit you can do with the Group 2 and 3 entities. (I gave some examples a while back in my earlier series on concept-oriented catalogs.)

I’ve seen a lot more debate about the practical role of the Group 1 WEMI stack than I have about the other FRBR entities. I’m finding it useful both to look at the way web catalogs like Open Library are handling these sorts of abstractions, as I did in this post, and at what seems to make the most sense in my own catalog implementation. I’ll discuss that, and the general data model that seems to work well there, in my next post.

elizabethwillse says:

September 16, 2012 at 8:05 pm

Thank you, this actually helped me understand FRBR for my knowledge organization class!
