I’ve seen a fair bit of buzz about the Library of Congress’s recent announcement of a new initiative for transforming the bibliographic framework of the library community. The announcement notes a number of recent developments that are drivers for such a transformation, including RDA, FRBR, the semantic web of linked data, and a growing consensus that the library community needs to move beyond MARC as its central data standard.
There’s another very important driver not explicitly mentioned in the announcement: the rise of open library data. More and more bibliographic and other library-related data is now freely available and reusable online, and it enables all kinds of improvements in resource discovery. I’ve previously discussed how open data for Library of Congress Subject Headings and Hathi Trust online books allows me to manage a catalog of over a million books, create rich subject browsing interfaces, and improve the quality of library catalogs at Penn. OCLC has also compiled data on millions of authors in VIAF, and now makes it available as open data. I recently downloaded their data set, and hope to get at least as much benefit out of it as I have with subject data to date. (I hope to report on early experiments before long.)
And there’s more big news from OCLC. At last month’s Global Council meeting, Karen Calhoun gave a presentation saying they were considering letting members release WorldCat data under an open license. I’m very excited to see this development; as I said a couple of years ago (in a panel discussion that also featured Karen Calhoun), it would make a huge trove of the library community’s bibliographic intelligence available to exploit and extend in all kinds of ways that could make it easier for library patrons to discover information resources. This is no small step for OCLC: at the time of our panel, OCLC was concerned that opening access to WorldCat data might threaten the sustainability of the WorldCat cooperative. If they’re seriously considering opening their data now, I suspect that a good number of WorldCat members have indicated that it’s an important and worthwhile thing to do, and that the benefits outweigh the risks.
If you also see potential in open library data, now is an excellent time to join in the discussions that the Library of Congress and OCLC are inviting. The more these and other leading organizations in the library community see how open data can advance the goals of the community, and how open data initiatives can get the support needed to be sustainable, the richer the knowledge base that our evolving bibliographic framework will support.
I don’t yet know where and how the LC and OCLC conversations will be taking place. But I’d love to hear readers’ thoughts and pointers about using and supporting open bibliographic data in the comments here.
Updated May 27 to clarify: I’ve heard from Roy Tennant that the OCLC proposal under consideration refers to WorldCat members releasing “their catalog data containing WorldCat records”, not OCLC itself releasing data (as my post originally stated). This is still potentially a big deal, even if it doesn’t mean opening up all of WorldCat. I’ve updated the post, marking edits with strikethrough and bold, and will talk about this more in the comment thread.