Everybody's Libraries
Libraries for everyone, by everyone, shared with everyone, about everything

The value of catalogs in the linked data era: Two recent talks

Posted on September 12, 2016 by John Mark Ockerbloom

I’ve recently uploaded two talks I gave this year to my Selected Works site.  I presented “How Not to Waste Catalogers’ Time: Making the most of subject headings” at the Code4lib 2016 conference in Philadelphia in March.  “How to Read 100 Million Publications: VIVO and Comprehensive Open Publication Databases” was a talk I gave at VIVO 2016 last month.  The versions I’ve deposited in Selected Works are PDF files that include both slides and notes, so you can see both what I showed and (approximately) what I said during the presentations.

The talks can be seen as two complementary takes on the question “What’s the value of the catalog, and cataloging, in today’s networked environment?”  In “How to Read 100 Million Publications” I highlight the value of an open, comprehensive database of the scholarly record that goes down to the article level.  Such a database would be quite large (on the order of 100 million records, as the title suggests), and considerably finer-grained than present-day library-run catalogs, which typically catalog at the title level, and only add a little bit of volume information at the holdings level.  But as I describe in the talk, 100-million-record-scale databases are now routinely maintained, and regularly replicated, online, and even a “just the facts” database of who published what articles where could be very helpful in promoting preservation, open access, corpus analysis, research networking systems, and a variety of other applications.  It would not need to be built from scratch: the data required for such a database already largely exists online, though all too often in fragmentary or proprietary collections.  A number of these collections could be brought together and opened up without requiring much additional cataloging labor.  Shared-effort collaborations, like those that support databases such as GOKb and SHARE, as well as more crowd-sourced projects like FictionMags or Wikipedia, could support the scale needed.

While “How to Read 100 Million Publications” proposes something new we could do with library catalogs, “How Not to Waste Catalogers’ Time” urges developers to pay close attention to what catalogers have already been doing with them, particularly the often-overlooked semantic richness of their subject cataloging.  At the VIVO conference I saw a demo of a BIBFRAME application where a book’s subjects were represented by an unordered set of FAST properties.  That’s easy enough to model as RDF linked data, but it omits the subject ordering that can improve relevance ranking and that lets readers distinguish the relative importance of subjects in a particular work.  It also doesn’t accommodate the detailed on-the-fly subject categories that coordinated subdivisions allow.  It’s possible to accommodate both of these in linked data, but doing so requires more complicated data models, and most catalogs don’t take full advantage of the semantic strengths of their subject headings.  My talk shows some examples of systems that can take advantage of them (as well as the strengths of FAST and Wikipedia terminology).  Whether or not we implement such systems broadly, I hope we continue to support ordered, coordinated subject headings in the underlying data of the next-generation catalogs we build.  It’s easy enough to automatically map from such data to FAST term sets, for discovery systems that work best with those, but you can’t reliably and automatically go back the other way and keep the original degree of precision.
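
To make that one-way mapping concrete, here is a minimal Python sketch.  It is a toy illustration rather than how FAST is actually derived: the `parse_heading` and `to_term_set` helpers are hypothetical names, splitting on “--” is a simplifying assumption, and the sample headings were chosen only for illustration.  It shows two books with different ordered, coordinated headings collapsing to the same unordered term set, after which the originals can’t be told apart.

```python
from typing import FrozenSet, List, Tuple

# A coordinated heading: a main term followed by its subdivisions, in order.
Heading = Tuple[str, ...]


def parse_heading(text: str) -> Heading:
    """Split a heading string such as 'Science -- History' on '--'.

    Hypothetical helper: real heading data has richer structure.
    """
    return tuple(part.strip() for part in text.split("--"))


def to_term_set(headings: List[Heading]) -> FrozenSet[str]:
    """Flatten ordered, coordinated headings into an unordered set of terms.

    Toy stand-in for a faceting step (an assumption, not the FAST rules):
    both the order of headings and the pairing of subdivisions are dropped.
    """
    return frozenset(term for heading in headings for term in heading)


# Book A is primarily about the history of philosophy, and also about science.
book_a = [parse_heading("Philosophy -- History"), parse_heading("Science")]

# Book B is primarily about the history of science, and also about philosophy.
book_b = [parse_heading("Science -- History"), parse_heading("Philosophy")]

terms_a = to_term_set(book_a)
terms_b = to_term_set(book_b)

# The ordered, coordinated descriptions differ...
print("Same headings:", book_a == book_b)
# ...but the flattened term sets are identical, so the mapping cannot be undone.
print("Same term sets:", terms_a == terms_b)
```

Going forward is a single set comprehension; going back would mean guessing which term was the primary subject and which subdivision belonged to which heading, which is the precision that is lost.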

In short, I want us to build catalogs that let catalogers and readers do more with metadata, rather than less.  I hope that these two talks suggest some ways we can do that, and I’d love to hear from folks who have more ideas, suggestions, or questions.

 


About John Mark Ockerbloom

I'm a digital library strategist at the University of Pennsylvania, in Philadelphia.
This entry was posted in discovery, metadata, open access, sharing, subjects.