Everybody's Libraries
Libraries for everyone, by everyone, shared with everyone, about everything

Adding more, and more structured, information about public domain serials

Posted on December 11, 2017 by John Mark Ockerbloom

I’ve been happy to hear from a number of people and institutions interested in the IMLS-funded project we now have underway to shed light on the hidden public domain of 20th century serials, which I discussed in my last post. I gave a 6-minute presentation on the project at this fall’s Digital Library Federation Forum, and you can find my presentation slides and script on the Open Science Framework. (You can download either a PowerPoint file or a PDF, as you prefer; both have the slides and the notes.)

With the help of Alison Miner and Carly Sewell, I’m now starting to add data to the inventory of serials that is one of the deliverables of the project.  Right now, we’re putting up data from 1966 renewals, where serial contributions from 1938 and 1939 were renewed.  But there’s quite a bit more data in the pipeline, and I hope that we’ll have all of the 1930s covered by the end of this month, and then advance rapidly into the 1940s in the new year.  (Our goal is to get up to 1978’s renewals, which will finish off the 1940s and get to 1950, and from there on the Copyright Office’s online database can be consulted for serial renewals.  We’re aiming to have that completed sometime in the spring.)
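The year arithmetic in the paragraph above can be sketched in a few lines of Python. This is only an illustration, and it rests on an assumption not spelled out in the post: a 28-year first copyright term, with the renewal filed during the final year of that term, so that one year's renewals mostly cover contributions published 28 and 27 years earlier. The function name is hypothetical.

```python
def publication_years_for_renewal_year(renewal_year):
    """Return the two publication years mostly covered by one year's renewals.

    Assumption: a 28-year first term, renewed during its final year, so
    renewals filed in year R cover works from years R - 28 and R - 27.
    """
    return (renewal_year - 28, renewal_year - 27)


for year in (1966, 1978):
    early, late = publication_years_for_renewal_year(year)
    print(f"{year} renewals: contributions from {early} and {late}")
```

Under that assumption, the 1966 renewals map to 1938 and 1939, and the 1978 renewals reach 1950, which matches the ranges described above.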

I’ve heard from various people who are interested in clearing copyrights for their own serials digitization projects, as well as from some projects that are doing it already, like the Hevelin Fanzines project that was also discussed at the DLF Forum. As I mentioned in my previous post, we intend to publish suggested procedures for doing such copyright-clearing. We’ll be preparing drafts of these procedures in the new year, and we’ll let interested folks know when the drafts are available for comments and suggestions.

We’re also happy to hear suggestions about other aspects of the project. One suggestion we heard in an early presentation was that the inventory should be shared as downloadable structured data, and not just as a big web page, so that it would be easier for people to repurpose and automatically analyze the data for various purposes. That sounded like a good idea to me, and I got more excited about its possibilities when looking at all the work that people were doing with projects like the FictionMags project, ISFDB and similar projects, where volunteers have crowd-compiled detailed structured contents information for a large number of serials.

So the new data is going into the inventory as structured data files, and we’re also slowly refitting existing entries into such files. That’s meant a bit of a slower startup than we’d originally planned, but we believe this work will pay dividends in the future. Already it means that we can reuse the same data in multiple contexts: for instance, we can show first-renewal information for Adventure magazine on the Online Books Page issue listings, on a copyright information page, and in the big inventory page. Updating the underlying structured data file can change what appears in all of these contexts.
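As a rough illustration of that reuse, here is a minimal Python sketch in which one record feeds three different displays. The record, its field names, and the three view functions are all hypothetical; they are not the schema or code behind the actual inventory.

```python
import json

# Hypothetical record: the field names below are illustrative assumptions,
# not the schema used by the project's actual JSON files.
RECORD_JSON = """
{
  "title": "Example Magazine",
  "first_renewal": {"issue_date": "1938-01", "renewal_id": "R-EXAMPLE-1"}
}
"""


def issue_listing_note(record):
    """Short note suitable for an issue listing."""
    renewal = record["first_renewal"]
    return f"No issue renewals found before {renewal['issue_date']}."


def copyright_page_summary(record):
    """Fuller sentence suitable for a copyright information page."""
    renewal = record["first_renewal"]
    return (f"{record['title']}: first renewed issue is "
            f"{renewal['issue_date']} (see {renewal['renewal_id']}).")


def inventory_row(record):
    """Compact row suitable for a large inventory page."""
    renewal = record["first_renewal"]
    return f"{record['title']} | issues renewed from {renewal['issue_date']}"


record = json.loads(RECORD_JSON)
views = [issue_listing_note, copyright_page_summary, inventory_row]
for view in views:
    print(view(record))

# Editing the one underlying record changes every view at once.
record["first_renewal"]["issue_date"] = "1939-06"
updated = [view(record) for view in views]
```

The point of the design is that the displays hold no data of their own, so a correction only has to be made in one place.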

Moreover, data structures are expandable. When readers asked me to list the science fiction magazines Amazing Stories and Galaxy on The Online Books Page, I had to determine which parts of their runs were public domain and thus could be listed without any further inquiries into permission. I looked up copyright renewals for these magazines and then recorded renewal data in the same sorts of structured data files so it could be reused. (Here, for instance, is the automatically generated copyright information page for Amazing Stories.) I also added structured data fields that allowed the inclusion of name identifiers, in particular Library of Congress Name Authorities, so that authors could be consistently identified and then linked to other information about them, such as contact information for permissions. With these links, and with the links to full issue tables of contents compiled by other projects, it becomes easier to digitize nearly any story in the early years of the magazine, by checking to see whether there is still an active copyright on it, and by sending an inquiry for permissions if there is one.
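The lookup described above might be sketched like this in Python. Everything here is an assumption made for illustration: the entries, field names, and identifiers are placeholders rather than real renewal data or the project's schema, and the link pattern assumes the Library of Congress's id.loc.gov addresses for name authority records.

```python
# Hypothetical data: titles, dates, and identifiers are placeholders, and the
# field names are assumptions rather than the project's actual schema.
RENEWALS = [
    {
        "issue_date": "1938-04",
        "contribution": "An Example Story",
        "author": "Example Author",
        "author_lccn": "n00000000",  # placeholder name authority identifier
        "renewal_id": "R-EXAMPLE-2",
    },
]


def name_authority_url(lccn):
    """Build a link to a name authority record (assumed id.loc.gov pattern)."""
    return f"https://id.loc.gov/authorities/names/{lccn}"


def find_renewal(issue_date, contribution, renewals):
    """Return the renewal entry for a contribution, or None if none is listed."""
    wanted = contribution.casefold()
    for entry in renewals:
        same_issue = entry["issue_date"] == issue_date
        if same_issue and entry["contribution"].casefold() == wanted:
            return entry
    return None


def clearance_note(issue_date, contribution, renewals):
    """Summarize what the renewal list says about one contribution."""
    entry = find_renewal(issue_date, contribution, renewals)
    if entry is None:
        return "No renewal recorded in this list."
    link = name_authority_url(entry["author_lccn"])
    return (f"Renewed ({entry['renewal_id']}); permission inquiry needed. "
            f"Author: {entry['author']} <{link}>")


print(clearance_note("1938-04", "an example story", RENEWALS))
print(clearance_note("1938-05", "Another Story", RENEWALS))
```

A missing entry only means that this particular list records no renewal; how much further checking is needed would depend on how complete the list is.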

To be clear: I’m not going to compile lists of all renewals for all the serials in my inventory. That would take more time than I have, since the scope of my 1-year grant only covers the first renewals for the serials that have them, up to 1950. But if I can create more detailed renewal lists for Amazing Stories using the defined structure, then others who are interested could create similarly structured renewal lists for other serials. And maybe those lists could be linked, shared or distributed from my inventory as well, if there’s interest.

So before I get much further into the project, I’d like to hear from folks who might be interested in using or compiling this sort of detailed renewal information.  Is this sort of structured information useful to you?  And if so, will the format and structure I’ve defined for the data files work?  Or should it change (something that’s easier to do now than later), or be augmented?

I didn’t find any pre-existing schema that covered detailed article-level copyright renewal data, so I decided to roll my own for starters. There’s a variety of encoding schemes one could use for it, including XML, JSON, and the various RDF formats. I figured JSON would be easiest for laypeople and librarians to understand and edit in its native format, and it can be automatically translated into suitable XML or RDF schemes if desired. But if you know of good reasons for preferring a different native format, or know of schemes I should reuse or extend instead of reinventing the wheel, I’d be interested in hearing about them. (I’m especially interested if the format or schema is already in common use by the sorts of folks who compile serial contents information.) Alternatively, if what I have now is as good a starting point as anything else, I’d be happy to know that, and could then take the time to write up formal documentation for it.
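To give a sense of how mechanical such a translation can be, here is a minimal Python sketch that converts a JSON record into XML using only the standard library. The record, the `serial` root element, and the `item` wrapper for list entries are all hypothetical choices, and the sketch assumes every JSON key is also a valid XML element name.

```python
import json
import xml.etree.ElementTree as ET


def to_xml(tag, value):
    """Recursively convert parsed JSON into an XML element.

    Objects become child elements named after their keys, arrays become
    repeated <item> elements, and scalars become element text.
    """
    elem = ET.Element(tag)
    if isinstance(value, dict):
        for key, child in value.items():
            elem.append(to_xml(key, child))
    elif isinstance(value, list):
        for child in value:
            elem.append(to_xml("item", child))
    else:
        elem.text = str(value)
    return elem


# Hypothetical record; the field names are not the project's actual schema.
record = json.loads(
    '{"title": "Example Magazine",'
    ' "renewals": [{"issue": "1938-01", "id": "R-EXAMPLE-1"}]}'
)
xml_text = ET.tostring(to_xml("serial", record), encoding="unicode")
print(xml_text)
```

A real mapping would probably target an agreed XML or RDF vocabulary rather than mirroring the JSON keys one for one.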

To see the existing files I have, you can go to the big inventory page and follow the “More details” links that you’ll see for certain serials in the list. These lead to copyright information pages for the serials in question, which in turn have links to JSON files at the bottom of each page. I also have most of the JSON files in a GitHub folder that’s part of my Online Books Page project repository.

If you work with this sort of metadata, or would like to, I’d love to hear from you.  If we get this right, I hope this data will spark all kinds of useful work opening access to a wide variety of 20th-century serial publications.

 


About John Mark Ockerbloom

I'm a digital library strategist at the University of Pennsylvania, in Philadelphia.
This entry was posted in copyright, metadata, open access, science fiction, serials, sharing.