Everybody's Libraries

The remainder of the Roaring 20s about to join the public domain

Posted on November 1, 2024 by John Mark Ockerbloom

Just two months from now, much of the world will celebrate another Public Domain Day, welcoming a year’s worth of works into the public domain. Many countries that have had life+70 years copyright terms for a while will get works by authors who died in 1954. Those fortunate enough to still have life+50 years terms will get works by authors who died in 1974. The rules in the United States are more complicated, but we’ll have nearly all our remaining copyrights from 1929 expire. That means that, for us, essentially all of the publication history of the “roaring 20s” will be public domain when the new year arrives.¹ That’s a wide sweep of culture available for everyone to enjoy, share, build on, and reuse.

The Twenties encompass the start of national women’s suffrage, the rise of the Jazz Age and the Harlem Renaissance, and the dawn of “talking” motion pictures, and extend to the “Black Tuesday” stock market crash and the beginning of the Great Depression. The Twenties had political upheaval to match the cultural and economic upheaval, including civil war in Ireland and many other places around the world, the birth of fascism in Europe, and the revival and decline of the Ku Klux Klan as waves of anti-immigrant and racist sentiment washed over much of America. But the decade also saw widespread international efforts to try to end war generally among nations. While the 1928 pact that many nations signed on to has often been viewed as a failure for not preventing World War II, it set a precedent for later international cooperation and peacekeeping efforts that can be credited with more success.

As I have in past years, I’ll be featuring a Public Domain Day countdown in the days leading up to New Year’s Day 2025, each day featuring an interesting work that will be joining the public domain then. You can follow it on this blog, or using RSS readers or social media that can connect with this blog. That includes Mastodon and other “fediverse” sites that connect with Mastodon using the ActivityPub protocol. I’ll also boost or link to the daily posts from my Mastodon account. (Most of the posts will have 500 characters or fewer, the size of a typical Mastodon post; a few may be longer.) You might also be able to follow my boosts and links from Bluesky (since my account is hooked up to Bridgy Fed), as well as possibly from Threads if they’ve enabled following Mastodon accounts. (That was on their roadmap for 2024, but I don’t know if it’s working yet.) My posts will include the hashtag #PublicDomainDayCountdown. I’ll be focusing on works joining the US public domain that are of interest to me, but you’re also welcome to post about works of interest to you joining the public domain where you are, and use the same hashtag if you like.

Right now for me, and for many others I’ve talked to, it’s hard to think much beyond next Tuesday. But I hope these posts help us anticipate some good things coming in the future, built on the knowledge and creativity of the past. May we all see and help bring about a better future in the days to come!

1. The rules in the US are different for unpublished works, and for sound recordings that aren’t part of motion pictures. (I told you US copyright law was complicated.) But this January 1, along with publications from 1929, we will be welcoming sound recordings released in 1924 (which have a 100-year term) into the public domain, as well as many unpublished works by people who died in 1954. For lots more details and special cases, see Cornell University Library’s public domain table.
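The date arithmetic behind the years mentioned in this post can be sketched in a few lines of Python. This is only an illustration, assuming that a term runs through the end of its final calendar year; the function name `public_domain_year` is made up for this example, and it ignores the many special cases that Cornell's table covers.

```python
def public_domain_year(start_year: int, term_years: int) -> int:
    """Return the year on whose January 1 a work enters the public domain.

    start_year is the author's death year (for life+N terms) or the
    publication/release year (for fixed terms). Assumption: the term runs
    through December 31 of its last year, so the work becomes free on the
    following January 1.
    """
    return start_year + term_years + 1


# The four cases mentioned in this post all land on the same Public Domain Day.
cases = {
    "life+70, author died 1954": public_domain_year(1954, 70),
    "life+50, author died 1974": public_domain_year(1974, 50),
    "US publication from 1929 (95-year term)": public_domain_year(1929, 95),
    "US sound recording from 1924 (100-year term)": public_domain_year(1924, 100),
}

for description, year in cases.items():
    print(f"{description}: public domain on January 1, {year}")
```

Each of these cases works out to January 1, 2025.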

Posted in publicdomain | Tagged PublicDomainDayCountdown | 13 Comments

Milestones for the Deep Backfile project

Posted on August 13, 2024 by John Mark Ockerbloom

Back in 2020 and 2021, while the Penn Libraries were largely closed, many of our librarians worked from home on the Deep Backfile project that I’ve written about here before. Faced with more demand than ever for online access to our collections while most users couldn’t go into our libraries, we researched the copyrights of some of the many thousands of journals, magazines, newspapers, and other serials in our collections. We hoped that documenting their public domain status could pave the way to making them more widely available not just to users of our library, but to readers in many other places as well.

By the time our library buildings fully reopened in 2021, our librarians had researched and reported on the copyright status of over 8,000 serials owned by Penn. They also found free online copies of at least some issues in over 2,000 of those serials. Our plan for the project was to do two reviews of every serial, one based on filling out a questionnaire we developed, and one done by someone with more expertise to review and edit the initially reported data, and to create a linked data record for it.

I’m pleased to announce that that second review is now complete. Copyright data for all the serials our librarians worked on is now published as JSON linked data, connected with Wikidata, available in bulk on GitHub, and linked to free online content that our librarians found (via The Online Books Page).

When combined with other work, such as the JSON records we have now made for all other serials in our first-copyright-renewals list, our full Deep Backfile knowledge base now covers over 12,000 serials. The free serials available via The Online Books Page now amount to over 25,000 titles, many of them automatically imported from the Directory of Open Access Journals, but over 7,500 more with records we’ve created especially for The Online Books Page. (And that doesn’t include many thousands of additional older serials on HathiTrust that we list but don’t yet have serial-specific records for.)

Many thanks

As you can see in the Credits section of the project page, a lot of people have worked on the Deep Backfile since 2020. I’m grateful to all of them. I want to especially thank Rachelle Nelson, who managed and trained library workers, Jim Hahn and Kathleen Burlingame, who coded automated creation of Wikidata entries for many of the serials, Jie Li, who created many of the Online Books records for serials with free online content, and Beth Picknally Camden, Joe Zucca, and Emily Morton-Owens, who supported having library workers at Penn work on this project (among others) while our library buildings were largely closed. Some library workers also continued to put time into the project even after the buildings reopened. Our most prolific contributors, Pete Sullivan and Nat Bender, each researched more than 1,000 serials. But there were also many other contributors who filled out questionnaires or created Wikidata entries, and whether they did it for just a few titles, or hundreds or more, their contributions are valued.

I hear regularly from readers around the world that use these and other serials online, thankful that they can access and read sources that were previously obscure or difficult to access in their research. The copyright information that the Deep Backfile team worked on has also been noticed by a number of digitization projects. The Internet Archive’s Serials in Microfilm project has been scanning microfilms and opening access for some of the serials we documented. HathiTrust conducted a pilot program for reviewing copyrights, based in part on our work, that led to them opening access to a small number of the serials we researched, and we now have a Deep Backfile table focusing on HathiTrust serial titles that might be openable there, if members are interested in supporting copyright review for them. As I noted in a talk I gave in January, we’ve also created another Deep Backfile table highlighting serials that have articles about them in English Wikipedia. We may also be able to take advantage of the information we’ve gathered for our own digitizations at Penn.

What’s next

We have a lot of information now about the rights and availability of many public domain serials. But there’s a lot of information we don’t yet have. The Penn Libraries own a lot of other serials we didn’t get to in our 2020-2021 survey. We don’t yet have information on a lot of the potentially public domain serials mentioned in Wikipedia. HathiTrust, the Internet Archive, and a lot of smaller sites now provide freely readable copies of serials we don’t yet list, including both public domain content and content freely licensed by the publishers or authors. And many of the large publishers and aggregators still include lots of public domain serial content behind paywalls.

So we could go in a variety of directions in further expanding our knowledge base. Which directions we focus on may depend on interest, support, and available resources. For now, I plan to take a short pause: first, for a vacation for much of the rest of this month, and then for working on some other digital library projects that have been in progress for a while (some of which you may hear about eventually).

But if you find this knowledge base of interest, I invite you to contribute more to it. To that end, I’ve adapted the questionnaire we developed for Penn librarians and now make it available for all of our Deep Backfile tables. Feel free to check it out, and fill it out for as few or as many serials as you like. Is there a serial on Wikipedia you’re interested in that we don’t yet have copyright information on? Feel free to select the “Contact us” link and answer the questions you see there. Annoyed when you hit an unwarranted paywall for an old or long-running journal at one of the big publishers, or a big aggregator? Go to its Deep Backfile table and help us document what’s public domain, and could be provided online by others even if a paywall exists elsewhere. Know of an authorized or public domain archive of one of the serials we mention? You can also use the “Contact us” links, or our general suggestion form, to let us know about additional content we can link to.

I’ve been gratified to regularly hear from readers who are using or are interested in the serials we now cover. And after I get back from vacation, I look forward to hearing more of what you’re interested in, and in reviewing any information you send us. Thank you again!

Posted in copyright, libraries, serials, sharing | 1 Comment

July 4 and the power of words

Posted on July 4, 2024 by John Mark Ockerbloom

The history of online books is intertwined with the history of Independence Day. It was on July 4, 1971 that Michael Hart, the founder of Project Gutenberg, was inspired to enter into a computer a copy of the Declaration of Independence. Here are some of the words he typed:

We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty, and the pursuit of Happiness. That to secure these rights, Governments are instituted among Men, deriving their just powers from the consent of the governed, That whenever any Form of Government becomes destructive of these ends, it is the Right of the People to alter or to abolish it, and to institute new Government, laying its foundation on such principles and organizing its powers in such form, as to them shall seem most likely to effect their Safety and Happiness.

195 years after the first publication of these words heralded the birth of a new nation, Michael Hart quoted them in what’s often considered the birth of the ebook. These words have staying power not just because the United States still exists as one of the world’s most powerful nations, and not just because they inspired people to fight for independence in the 18th century. They have staying power because they inspired people to establish and protect human dignity, equality, and empowerment in many other times and places. Consider, for instance, the Seneca Falls Declaration, echoing and reworking these words 72 years later to proclaim “all men and women are created equal”, and to launch efforts to secure American women’s rights to vote that would take another 72 years to win. Consider also the many declarations around the world that, as David Armitage notes, have drawn on this Declaration’s words for inspiration.

It’s often easy for July 4 to be an occasion for Americans to glibly congratulate ourselves. But even if the Declaration proclaims the truths above “self-evident”, the fulfillment of their promise has been anything but self-evident. Those who wrote and signed the Declaration knew they would be at risk for losing “our Lives, our Fortunes and our sacred Honor” for years afterwards, until those who fought and supported the American Revolution made Great Britain agree to a treaty recognizing the new nation. The writers of the Declaration themselves also failed to consistently live up to the ideals stated in it. Many of them, including the Declaration’s main drafter, continued to enslave other people for all of their lives. Indeed, the conflict between the principle that “all men are created equal” and the desire to keep some human beings subjugated to others eventually led to a war much bloodier than the American Revolution. During that war, Lincoln’s Gettysburg Address again invoked the Declaration’s founding principle of equality, and called the Civil War a test of whether “any nation so conceived and so dedicated can long endure.”

Our nation has endured many such tests in our 248 years, some of them resulting in victories, some of them resulting in setbacks. Sometimes the power of words alone has not been enough to guarantee people’s “unalienable Rights”. In the wake of the Civil War, three constitutional amendments were passed that seemed to guarantee the right to vote regardless of race, the equal protection of the laws, and the end of slavery. But less than a decade later, a bitterly fought close election led to a compromise that undid those guarantees. African Americans soon effectively lost their right to vote in much of the United States, until the 1965 Voting Rights Act. Exploitation of convict labor in tandem with a dramatic rise in imprisonment, often under flimsy pretexts, led to what Douglas A. Blackmon and others have called “slavery by another name”. In Plessy v. Ferguson, John Marshall Harlan insisted in his dissent that the Fourteenth Amendment “placed our free institutions upon the broad and sure foundation of the equality of all men before the law”. But he could not override the ruling of seven other Supreme Court justices in that case, who accepted a claim that enforced racial segregation could in practice be “separate but equal”.

These setbacks and others remind us that while “the People” can institute governments that respect our rights, “the people” can also let us down. We are people, after all: human beings who have unalienable rights and dignity, but who also have flaws, vices, weaknesses, and vulnerabilities. Any given institution, from a neighborhood association to the Supreme Court, can let us down too. They are also made of people, after all, ultimately doing what the people involved in them decide to do.

I know a number of people who are discouraged this July 4 in particular, as we’ve seen our Supreme Court once again let us down. They have issued rulings whose implications extend to “altering fundamentally the Forms of our Governments”, as the Declaration writers complained of King George. Particularly alarming among them is the ruling of six of the justices giving presidents immunity from prosecution so broad as to enable the kind of “despotism” the Declaration writers denounced in George III. Justices Sotomayor and Jackson noted this in their dissents, and other writers like Mark Joseph Stern have amply argued the case as well. The Supreme Court by itself cannot kill American democracy, but figuratively speaking, it’s placed a gun on the mantelpiece that a leading presidential candidate has demonstrated a willingness to fire upon regaining power. If we thought we could rely on the Supreme Court or any other institution unquestioningly, or assume that saying the right words to it would make it honor those words as we thought they should, we’ve been gravely disillusioned.

What our words can do, though, is help us keep working for what we know is right, until we overcome the setbacks with victories. In our libraries, we can bring these words together, and ensure that people continue to have access to them, for as long as that takes. We have all kinds of words that are needed for those struggles. We have words that inspire, words that persuade, and words that tell the stories of what has been, and what could be. We have words that tell us how democracies have formed, how they’ve failed, and how they’ve been restored. We have the words of all kinds of people that help us more easily see and treat those people as our equals, and deserving of the same unalienable rights that we have ourselves. We have words that illuminate the grievances behind long-standing conflicts, expose the atrocities of those conflicts, and show us how some of those conflicts have been resolved justly, even when few could imagine that outcome. We have words that help us pursue our happiness, and enable others to pursue their happiness as well.

Brought together, and freely shared, words have a lot of power. Authoritarians know this– that’s why so many of them try to ban books or coercively manipulate the way people communicate with each other. It’s why librarians raise the alarm when they see book bans on the rise, on whatever pretext. And it’s also a reason for us to hope that as we communicate with each other, organize with each other, and take action with each other, we can make real the promises of the words we treasure.

That’s why on this July 4, I celebrate the words of the Declaration of Independence that Michael Hart typed out and that I repeated at the top of this post. I’ll continue to organize and seek out more words that can be used to protect life, promote liberty, and pursue happiness. I resolve to do what I can, particularly between now and November, to work towards a society and government that honors the fundamental equality of all human beings, and the free consent of the governed. And I remember those who have engaged in similar struggles before me, including those Lincoln honored for fighting in another July long past, so that “government of the people, by the people, for the people, shall not perish from the earth.”

Posted in libraries, people, sharing | Tagged america, constitution, declaration-of-independence, history, politics | Comments Off

Free the sources!

Posted on January 16, 2024 by John Mark Ockerbloom

I gave a lightning talk this past Sunday when Mary¹ and I attended Wikipedia Day at the Columbia School of Journalism. Below is approximately what I said, with links to websites I showed during the talk, and a few footnotes. Our thanks to Wikipedia NYC and the Brown Institute for Media Innovation for hosting the event!

I’m glad to be here to celebrate Wikipedia’s birthday this weekend. (And I’m looking forward to the cake².) Many of us are also celebrating some other things, like recently a new public domain day. And we’re not just celebrating famous characters like Mickey Mouse, but all kinds of cultural works and information resources that we write about in Wikipedia and use as sources for our articles.

And it’s not just works from 1928 like Steamboat Willie, but it’s also a lot of later works that are not so obviously in the public domain, like all the works as late as 1963 that didn’t renew their copyrights when required and works as late as 1989 published in the US without copyright notices.

Wikipedians have long recognized the value of public domain resources in the work we do. And if we can build up a better, more comprehensive and more reliable understanding of all the things that are in the public domain, we can share more of it with the world, and use more of it in Wikipedia and other free and open projects.

I work at the Libraries at the University of Pennsylvania. Our collections have a lot of public domain source materials. A fair bit of our obvious public domain has been digitized. But we also have a lot of non-obvious “hidden” public domain materials. In particular, we have a lot of serials: journals, magazines, newspapers, newsletters, and the like. They’re often great sources for knowledge and culture you can’t find anywhere else, and a lot of this content from the 20th century is public domain because the publishers didn’t bother to maintain their copyrights.

So, a while back we started what we call the Deep Backfile serials project. We wrote some code to identify serials we held that might be in the public domain. That table of serials that we compiled was big, and we weren’t likely to research all of it any time soon. But then the COVID pandemic hit and we had to close the library buildings. We realized that it was a great opportunity to have many of our staff now working from home research the copyrights of lots of these serials so they could eventually be made available online not just during lockdowns, but afterwards as well.

To do this, we created a detailed questionnaire which allowed a librarian to consult some designated sites about any serial in our list. Once they’d answered all the questions they could and submitted the questionnaire, an expert would review it, and we’d post what we found out about what was copyrighted in that serial, what seemed to be public domain, and what could be freely put online.³

Now some serials, like The New Yorker, had regular renewals, and pretty much all of their issues get the full 95 years of copyright. But for other serials, like, say, the Columbia Journalism Review, little or nothing was renewed in their early days, so in fact a number of their issues from the 1960s can freely go online (and some have).

It turns out there are a lot more serials like the Columbia Journalism Review than there are like The New Yorker. And we know that in part because while our library buildings were closed our librarians used that questionnaire to research over eight thousand serials.

I still have a few hundred of them left to review– regrettably, the only person regularly available for expert review was me– but everything we have reviewed we’ve published online as linked open data, with links to and from Wikidata, and to Wikipedia, and to any free and legal online copies of serial issues that we know about. And that’s a growing corpus, because digitizers like Internet Archive, HathiTrust, as well as any number of smaller independent digitizers have access to this information, and they can use it to make serial content available online free for all.

Now Wikipedia also has a lot of information about serials. In fact, when I ran a Wikidata query to find serials that had articles about them in English Wikipedia, I found well over ten thousand of them that were potentially or actually in the public domain, at least in part. And while Penn librarians have researched a lot of them, and I show what we’ve found out in this table, the majority of these serials described in Wikipedia don’t yet have expert-reviewed copyright information on them.

So, I hope I’m not going to regret this, but I’ve just taken that questionnaire that we used in the Penn Libraries, and I’ve now made it available for all of these serials described in Wikipedia.

So if you’re a Wikipedian interested in documenting and freeing these serials, you can fill out this questionnaire for any serial in this table you’re interested in. And I can review it, and publish what you’ve found as CC0 linked open data, and link it with Wikidata, so it’ll be available to anyone who’s willing and able to put public domain content from that serial online.

There’s a lot of work that can be done here, but I’m hoping there are a few Wikipedians here who are interested in some of these serials, and we can try putting them into this Deep Backfile open knowledge base, and perhaps scale it up over time as we have in the Penn Libraries to document and free a lot of new sources in the public domain.

If this interests you, the Meetup page for this Wikipedia Day event has a link under Lightning talks to the Deep Backfile knowledge base I’ve created for serials covered in Wikipedia, and a link for contacting me. Thank you!

Footnotes

1. Mary Mark Ockerbloom has more experience editing Wikipedia than I do, and has been active in the Wikipedia community for years, co-leading a regular WikiSalon, and working on her own and as a Wikipedian in Residence for various organizations on topics like women writers, science communication, and countering disinformation. She’s currently available to work with new projects and organizations.
2. As it turned out, Mary and I had to leave the event before the cake came out, so we could catch our train back home. But we hope it was good.
3. At the time I gave this talk, the questionnaire link went to a full detailed form for a serial we hadn’t yet researched, with the title and ISSN pre-filled but the rest of the form blank. It might look different in the future after the serial is researched and included in our Deep Backfile knowledge base. To see an example of a blank detailed questionnaire, go to any serial in this table with “Unknown” in the “First renewal” column, and select the “Contact us” link at the right end of its table row.
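The three routes into the US public domain mentioned in the talk (an expired 95-year term, a missed renewal for works as late as 1963, and publication without notice as late as 1989) can be roughly sketched as a first-pass check. This is a simplified illustration, not the logic our questionnaire or reviewers actually use. The function name and its parameters are hypothetical, it works only at year granularity, and it ignores special cases such as foreign works, the exact cutoff date within 1989, and later registrations that could cure a missing notice.

```python
def possibly_public_domain_us(pub_year: int, had_notice: bool,
                              renewed: bool, as_of_year: int) -> bool:
    """First-pass guess at whether a US publication might be public domain.

    Hypothetical helper for illustration only; a True result means
    "worth researching further", not a legal determination.
    """
    # Route 1: the full 95-year term has run out. Terms last through the
    # end of the 95th calendar year, hence the strict comparison.
    if pub_year + 95 < as_of_year:
        return True
    # Route 2: published with notice as late as 1963, but the copyright
    # was not renewed when required.
    if pub_year <= 1963 and not renewed:
        return True
    # Route 3: published as late as 1989 without a copyright notice.
    # Assumption: year granularity only; exceptions are not modeled.
    if pub_year <= 1989 and not had_notice:
        return True
    return False


# Steamboat Willie (1928) as of Public Domain Day 2024.
print(possibly_public_domain_us(1928, True, True, 2024))
# A 1960s magazine issue published with notice but never renewed.
print(possibly_public_domain_us(1962, True, False, 2024))
```

A serial like The New Yorker, with notices and regular renewals, would fall through all three checks until its 95 years are up.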

Posted in citizen librarians, open access, publicdomain, serials, wikipedia | Comments Off

The public domain gets the last word

Posted on December 31, 2023 by John Mark Ockerbloom

In 1857, work began on a revolutionary new dictionary covering the entire history of English word usage with example quotations. The first installment of A New English Dictionary on Historical Principles, covering A through Ant, appeared in 1884. The last, covering V-Z, was published in 1928. Its US copyright status has been murky, but as of tomorrow the entire first edition of what’s now known as the Oxford English Dictionary is definitively in the US public domain. #PublicDomainDayCountdown

Posted in publicdomain | Tagged PublicDomainDayCountdown | 2 Comments

Extra! Extra!

Posted on December 30, 2023 by John Mark Ockerbloom

Sometimes one work’s arrival in the public domain brings extras along with it. In two days, Ben Hecht and Charles MacArthur’s play The Front Page, which Peter Marks called “the best play about newspapering ever written”, joins the public domain. Assuming no other prior copyright dependencies, that also frees two films derived from it that have unrenewed copyrights: the 1931 film The Front Page, and the 1940 film His Girl Friday. Both are in the National Film Registry. #PublicDomainDayCountdown