I gave a talk on metadata and online book collections at the BooksOnline workshop at CIKM 2010. The workshop was held on October 26 in Toronto. This page includes the abstract, presentation materials, and supplementary materials.
With millions of books, serials, and other documents now digitized, rich troves of information and culture can now be made available to anyone in the world with an Internet connection. But these riches are worthless if they cannot be found, accessed, and effectively used by the readers who need them. The key to unlock these treasures is metadata. Networked computing enables techniques for making metadata more effective than ever; yet in practice, online collections all too often either do not have or do not take full advantage of the best metadata they could use.
There is much ongoing work harnessing metadata to improve online book discovery, access, and usability. Online book discovery is being enhanced with concept-oriented catalogs of various kinds, including browsable maps relating millions of subjects and associated books. Copyright metadata is starting to open access to many books that had been needlessly withheld from the public, while also reducing the risk of inadvertent infringement. Structural and relational metadata and annotations are making complex works much more usable than they were when they were represented as a mishmash of volumes.
Using metadata effectively in multi-million-volume collections poses special problems of scale. Solving these problems requires considered application of both library science and computer science. It also requires harnessing the collective intelligence of readers, writers, librarians, and publishers. Wise metadata management policies, including open data sharing, can promote the effective aggregation of human and machine intelligence at the scale we will need. This talk will demonstrate and describe ways in which we can meet the metadata challenges of large-scale online libraries both now and in the future.
Selected background materials
Below are links to previous writings of mine on some of the issues mentioned in the abstract above.
- A series of blog posts on concept-oriented catalogs.
- A paper looking at metadata (and more specifically, provenance information) required for copyright clearance.
- A position statement, “Open bibliographic data promotes knowledge of the public domain”, from the Open Knowledge Foundation blog.
A PDF with slides from the talk, along with my notes, can now be found on my Selected Works site.