In my previous two posts, I discussed how open library metadata is becoming increasingly important for the future of library content, and how OCLC’s new catalog policy works against it. By asserting proprietary rights over the records in WorldCat, OCLC risks relegating libraries’ painstaking descriptions of their resources to the sidelines of Web-based innovation and aggregation. In the long term, this threatens to marginalize libraries and their missions.
In this last post of the series, I suggest responses that may improve the situation. OCLC’s new policy won’t go into effect until February, and it’s evolved since it first hit the Net a few weeks ago. Commenting on the policy, OCLC vice-president Karen Calhoun has said:
“We believe that libraries, the wider library-archives-museum community, and those they serve will benefit from the updated policy without placing our shared investment in WorldCat at peril.”
I’ll take her word that OCLC believes it is acting in the best interests of the members of the WorldCat cooperative. I just don’t think its current proposal actually serves libraries’ best interests. But ultimately the libraries themselves, not I or OCLC’s officials, are the ones best placed to determine that. So my first suggestion is:
Get informed, and speak up. There’s been a lot of discussion about OCLC’s new policy, and you can find pointers to much of the public online debate here. So far, I’ve mostly just seen OCLC officials defending the policy, and various individuals inside and outside libraries criticizing it. The libraries in the WorldCat cooperative need to make their voices heard as well. Whether they’re for or against the policy, they should have a say on the fate of the metadata corpus they’ve worked to support and populate.
OCLC’s new policy doesn’t just affect traditional library catalog applications. The MARC records OCLC manages can hold lots of important additional information about library materials, including information on digitization, copyright, and preservation. I’ve advocated cooperative information sharing in these areas (and participated in planning discussions for some of them), and I’ve been pleased to see OCLC’s leadership there. I don’t want to see these initiatives held back by restrictive sharing policies attached to their records.
Know (and assert) your rights. The main legal justifications for restricting the reuse and sharing of catalog records are copyright law and contract law. Contract law may sometimes override rights that one would normally have under copyright (such as fair use and public domain), but generally requires some sort of specific agreement between parties. If you’re working with OCLC, you might want to make sure you don’t sign away rights you might otherwise have to your own cataloging, or to that of other libraries. I’m not a lawyer, and I can’t give you advice on how best to do this, but adding explicit statements of your rights, and modifying or striking out questionable concessions in OCLC contracts, might be in order.
It’s worth noting that OCLC’s policy, in its current draft, specifically says that it does not restrict simply associating an OCLC control number with an otherwise independent record, or a library reusing and redistributing WorldCat records marked as its own original cataloging. Libraries may want to explicitly affirm these prerogatives. (And third parties can freely attach OCLC numbers to their own records, and could make arrangements with libraries that have done original cataloging to free their records.)
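To make "associating an OCLC control number with an otherwise independent record" concrete, here is a minimal Python sketch. It assumes a simplified record layout (a dict mapping MARC tags to lists of subfield dicts) and a hypothetical helper named `attach_oclc_number`; neither comes from OCLC or from any particular library system. The number 12345 is made up. The sketch follows the common MARC convention of storing the number in field 035 with an "(OCoLC)" prefix.

```python
OCLC_PREFIX = "(OCoLC)"  # conventional prefix for OCLC numbers in MARC field 035


def attach_oclc_number(record, oclc_number):
    """Return a copy of `record` with an 035 field carrying the OCLC number.

    `record` is a simplified stand-in for a MARC record (an assumption for
    this sketch): a dict mapping tags to lists of subfield dicts.
    """
    if not isinstance(oclc_number, int) or oclc_number <= 0:
        raise ValueError("OCLC number must be a positive integer")
    # Copy the field lists so the caller's record is left untouched.
    updated = {tag: list(fields) for tag, fields in record.items()}
    link = {"a": f"{OCLC_PREFIX}{oclc_number}"}
    if link not in updated.setdefault("035", []):
        updated["035"].append(link)
    return updated


# An independently created description; it copies nothing from WorldCat.
my_record = {
    "100": [{"a": "Author, Example"}],
    "245": [{"a": "An example title"}],
}

# 12345 is an invented number, used only for illustration.
linked = attach_oclc_number(my_record, 12345)
print(linked["035"])
```

The descriptive fields stay entirely the contributor's own work; only the identifier is added, and it serves as a pointer for matching records across systems.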
The extent to which copyright can apply to catalog records is not a settled matter, to my knowledge, but the US Supreme Court’s Feist decision makes it clear that, under US law, simply compiling facts without “requisite originality” is not sufficient for copyright. It follows straightforwardly, in my opinion, that basic factual citation information in a catalog record (such as title, author, and edition) and other objective facts (such as page counts and physical dimensions) are no more copyrightable than the names, addresses and telephone numbers in the Feist telephone directory case were. In addition, works originally created by federal government agencies (like the Library of Congress) are not subject to US copyright. In other countries, though, copyright or other restrictions might apply to objective facts; and US copyright might apply to more subjective parts of a catalog record (like prose descriptions or subject analysis). If you want to remove doubt about the reusability of your metadata, and prevent it from being made proprietary, I suggest:
Attach a Creative Commons Attribution-ShareAlike license to your records. (Update: But see the discussion in the comments below about alternatives, and pros and cons of different approaches, before deciding to go this route. – JMO) As I’ve noted previously, the Attribution-ShareAlike license allows any sort of reuse of a record, but requires that its source be credited, and that any copies or derivatives of the record remain shareable. Creative Commons licenses specifically reaffirm all the normal rights that users have under copyright law, so it should be safe to add them even to records that might be public domain, without compromising their status. That’s the license I use for my Online Books Page records.
The license should cover the record itself, and records that derive from it. It should not apply to the entire contents of any database that contains the record (as some have suggested). A database-wide viral license has both philosophical and practical problems. (There are lots of reasons why I might want to have an ILS or union catalog that contains both proprietary and Share-Alike records. For instance, I might have various electronic collections for which I’ve previously bought proprietary records, completely independent from my open records.)
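As a rough illustration of record-level (rather than database-wide) licensing, the Python sketch below puts the license statement in the record itself, carries it into derivative records, and leaves unrelated records in the same catalog untouched. The dict-based record layout, the helper names, and "Example Library" are all assumptions made for this sketch. The choice of MARC field 540 (Terms Governing Use and Reproduction Note, with the URI in subfield u) and of the 3.0 version of the license URL are likewise illustrative choices, not something the policy or any cataloging standard requires.

```python
# Assumed license URL for this sketch; substitute the version you intend to use.
CC_BY_SA_URL = "https://creativecommons.org/licenses/by-sa/3.0/"
LICENSE_TAG = "540"  # MARC 21 Terms Governing Use and Reproduction Note


def add_share_alike(record, source):
    """Return a copy of `record` carrying an Attribution-ShareAlike note."""
    licensed = {tag: list(fields) for tag, fields in record.items()}
    licensed[LICENSE_TAG] = [{
        "a": f"Creative Commons Attribution-ShareAlike; credit {source}",
        "u": CC_BY_SA_URL,
    }]
    return licensed


def derive(record, changes):
    """Build a derivative record; the license note travels with it."""
    derived = {tag: list(fields) for tag, fields in record.items()}
    for tag, fields in changes.items():
        if tag != LICENSE_TAG:  # a derivative may not drop or replace the license
            derived[tag] = list(fields)
    return derived


def is_share_alike(record):
    """True if the record carries the ShareAlike license URL in field 540."""
    return any(f.get("u") == CC_BY_SA_URL for f in record.get(LICENSE_TAG, []))


open_record = add_share_alike({"245": [{"a": "An example title"}]}, "Example Library")
enhanced = derive(open_record, {"520": [{"a": "A summary added later."}]})
vendor_record = {"245": [{"a": "A purchased record"}]}  # separately acquired, no license note

# One catalog can hold both kinds of record side by side.
catalog = [open_record, enhanced, vendor_record]
share_alike_flags = [is_share_alike(r) for r in catalog]
```

Because the license is attached per record, the purchased record's status is unaffected by sitting in the same catalog as the open ones.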
OCLC can use Attribution-ShareAlike records in WorldCat and in any of the services they sell, just as any other commercial or noncommercial organization can. (They could also include them alongside proprietary records.) But they can’t add their own restrictions to the records (which, as I’ve noted previously, are incompatible with ShareAlike) without either violating the Creative Commons terms or implicitly admitting that the record is public domain. Thus, it’s possible to pre-emptively “inoculate” records contributed to WorldCat, or to any other aggregation that includes the records, against their being made proprietary.
OCLC could decide to refuse records with Attribution-ShareAlike licenses. But they’d have to go out of their way to do this, and they’d be acting against the stated wishes of the contributor. I think it’s worth seeing if they’d push back like this. If they did, that would be worth bringing to the public’s attention, and it might also be worth distributing the records outside of OCLC. Which brings me to my last suggestion:
Consider alternative methods for sharing library metadata. Just as OCLC is not the creator of most of the library metadata in WorldCat, it is not the only possible coordinator of library metadata. A number of other organizations are also aggregating descriptions of library resources: some for specific applications, like Google’s Book Search; some for social networking businesses, like LibraryThing; some for preservation, like Hathi Trust; and some for free-for-all sharing, like Open Library. Other data hubs may arise as well. (Indeed, an open WorldCat could remain a vital hub that enriches those other aggregations, and is in turn enriched from the information aggregated at the other hubs.) Libraries might contribute records to multiple aggregators.
It’s important to remember that no broad-based data cooperative is likely to satisfy all its members completely. An alternative to OCLC will not necessarily be more open, unless its members hold it to that standard. Reliability, quality control, and seamless interaction are not easy to provide, and participants in alternative networks will have to put time and effort into getting them right.
New cataloging cooperatives could also provide places to experiment with better representations and workflows for library metadata. Implementors need to be careful about getting sidetracked here; past experiences with RDA and other proposed changes to cataloging infrastructure show that new initiatives can be argued over for years without much progress. But if new union catalogs are compatible with existing catalog systems (such as by providing and accepting standard MARC records), and support efficient workflows, they can potentially represent metadata internally in new structures that might be more informative and easier to maintain. This could improve library cooperation and sharing across the board (and maybe improve WorldCat itself in the process).
This ends my discussion, at least for now, of OCLC’s new policy and its conflict with open library metadata. I hope it’s helped inform and advise readers about the debate, and the issues at stake. And I hope it will help readers determine where they stand, and how they should respond.