I was pleased to read last week that the National Digital Newspaper Program, which has sponsored the digitization of over 1 million historically significant newspaper pages, has announced that it has expanded its scope to include content published up to 1963, as long as public domain status can be established. I’m excited about this initiative, which will surface content of historic interest that’s in many readers’ living memory. I’ve advocated opening access to serials up to 1963 for a long time, and have worked on various efforts to surface information about serial copyright renewals (like this one), to make it easier to find public domain serial content that can be made freely readable online. (In the US, renewal became automatic for copyrights secured after 1963, making it difficult to republish most newspapers after that date. Up till then, though, there’s a lot that can be put online.)
Copyright in contributions
Clearing copyright for newspapers after 1922 can be challenging, however. Relatively few newspapers renewed copyrights for entire issues; as I noted 10 years ago, none outside of New York City did before the end of World War II. But newspapers often aggregate lots of content from lots of sources, and as far as I can tell, the copyright status of those various pieces of content needs to be determined as well. While section 201(c) of copyright law normally gives copyright holders of a collective work, such as a magazine or newspaper, the right to republish contributions as part of that work, people digitizing a newspaper that didn’t renew its own copyright aren’t usually copyright holders for that newspaper. (I’m not a lawyer, though. If any legal experts want to argue that digitizing libraries get republication rights similar to those of the newspaper copyright holders, feel free to comment.)
As I mentioned in my last post, we at Penn are currently going through the Catalog of Copyright Entries to survey which periodicals have contributions with copyright renewals, and when those renewals started. (My previous post discussed this in the context of journals, but the survey covers newspapers as well.) Most of the contributions in the section we’re surveying are text, and we’ve now comprehensively surveyed up to 1932. In the process, we’ve found a number of newspapers that had copyright-renewed text contributions, even when they did not have copyright-renewed issues. The renewed contributions are most commonly serialized fiction (which was more commonly run in newspapers decades ago than it is now). Occasionally we’ll see a special nonfiction feature by a well-known author renewed. I have not yet seen any contribution renewals for straight news stories, though, and most newspapers published in the 1920s and early 1930s have not made any appearance in our renewal survey to date. I’ll post an update if I see this pattern changing; but right now, if digitizers are uncertain about the status of a particular story or feature article in a newspaper, searching for its title and author in the Catalog of Copyright Entries should suffice to clear it.
Photographs and advertisements
Newspapers contain more than text, though. They also include photos, as well as other graphical elements, which often appear in advertisements. It turns out, however, that the renewal rate for images is very low, and the renewal rate for “commercial prints”, which include advertisements, is even lower. There isn’t yet a searchable text file or database for these types of copyright renewals (though I’m hoping one can go online before long, with help from Distributed Proofreaders), and in any case, images typically don’t have unambiguous titles one can use for searching. However, most news photographs were published just after they were taken, and therefore they have a known copyright year and specific years in which a renewal, if any, should have been filed. It’s possible to go through the complete artwork and commercial print renewal listings for any given year, get an overview of all the renewed photos and ads that exist, and look for matches. (It’s a little cumbersome, but doable, with page images of the Catalog of Copyright Entries; it will be easier once there are searchable, classified transcriptions of these pages.)
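The year arithmetic behind "specific years in which a renewal, if any, should have been filed" can be sketched in a few lines. This is a minimal sketch, not legal advice: it assumes the 28-year first copyright term that applied to US works of this era, with any renewal filed during the final year of that term. The function name and the two-year search window are illustrative assumptions, not anything defined by the Catalog of Copyright Entries.

```python
def renewal_search_years(publication_year):
    """Return the calendar years in which a renewal filing could fall.

    Assumption (hypothetical helper): the first term runs 28 years from
    publication, and a renewal had to be filed during the last year of
    that term, which can straddle two calendar years.
    """
    if publication_year > 1963:
        # Renewal became automatic for copyrights secured after 1963,
        # so a renewal search is not the relevant test for these works.
        raise ValueError("renewal was automatic for copyrights secured after 1963")
    return (publication_year + 27, publication_year + 28)


if __name__ == "__main__":
    first, last = renewal_search_years(1930)
    print(f"Check renewal listings dated {first} through {last}")
```

For a photograph published in 1930, this points at renewal listings from 1957 and 1958, so those are the years' artwork and commercial print sections one would page through.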
Fair use arguments may also be relevant. Even in the rare case where an advertisement was copyright-renewed, or includes copyright-renewed elements (like a copyrighted character), an ad in the context of an old newspaper largely serves an informative purpose, and presenting it there online doesn’t typically take away from the market for that advertisement. As far as I can tell, what market exists for ads mostly involves relicensing them for new purposes such as nostalgia merchandise. For that matter, most licensed reuses of photographs I’m aware of involve the use of high-resolution original prints and negatives, not the lower-quality copies that appear on newsprint (and that could be made even lower-grade for purposes of free display in a noncommercial research collection, if necessary). I don’t know if NDNP is planning to accommodate fair use arguments along with public domain documentation, but they’re worth considering.
Syndicated and reprinted content: A thornier problem
Many newspapers contain not only original content, but also content that originated elsewhere. This type of content comes in many forms: wire-service stories and photos, ads, and syndicated cartoons and columns. I don’t yet see much cause for concern about wire news stories; typically they originate in a specific newspaper, and would normally need to be renewed with reference to that newspaper. And at least as far as 1932, I haven’t yet seen any straight news stories renewed. Likewise, I suspect wire photos and national ads can be cleared much like single-newspaper photos and ads can be.
But I think syndicated content may be more of a sticky issue. Syndicated comics and features grew increasingly popular in newspapers in the 20th century, and there’s still a market for some content that goes back a long way. For instance, the first contribution renewal for the Elizabethan Star, dated September 8, 1930, is the very first Blondie comic strip. That strip soon became wildly popular, published by thousands of newspapers across the country. It still enjoys a robust market, with its official website noting it runs in over 2000 newspapers today. Moreover, its syndicator, King Features, also published weekly periodicals of its own, with issues as far back as 1933 renewed. (As far as I can tell, it published these for copyright purposes, as very few libraries have them, but according to WorldCat an issue “binds together one copy of each comic, puzzle, or column distributed by the syndicate in a given week”. Renew that, and you renew everything in it.) King Features remains one of the largest syndicators in the world. Most major newspapers, then, include at least some copyrighted (and possibly still marketable) material at least as far back as the early 1930s.
Selective presentation of serial content
The most problematic content of these old newspapers from a copyright point of view, though, is probably the least interesting content from a researcher’s point of view. Most people who want to look at a particular locale’s newspaper want to see the local content: the news its journalists reported, the editorials it ran, the ads local businesses and readers bought. The material that came from elsewhere, and ran identically in hundreds of other newspapers, is of less research interest. Why not omit that, then, while still showing all the local content?
This should be feasible given current law and technology. We know from the Google and HathiTrust cases that fair use allows fully copyrighted volumes to be digitized and used for certain purposes like search, as long as users aren’t generally shown the full text. And while projects like HathiTrust and Chronicling America now typically show all the pages they scan, commonly used digitized newspaper software can either highlight or blank out not only specific pages but even the specific sections of a page in which a particular article or image appears.
This gives us a path forward for providing access to newspapers up to 1963 (or whatever date the paper started being renewed in its entirety). Specifically, a library digitization project can digitize and index all the pages, but then only expose the portions of the issues it’s comfortable showing given its copyright knowledge. It can summarize the parts it’s omitting, so that other libraries (or other trusted collaborators) can research the parts it wasn’t able to clear on its own. Sections could then be opened up as researchers across the Internet found evidence to clear up their status. Taken as a whole, it’s a big job, but projects like the Copyright Review Management System show how distributed copyright clearance can be feasibly done at scale.
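The workflow described above (index everything, expose only what has been cleared, and summarize the rest for other researchers) can be sketched as a small data model. This is a minimal sketch under stated assumptions: the `Region`, `Status`, and `plan_display` names, the bounding-box representation, and the three status values are all hypothetical, and do not describe the data model of any existing newspaper digitization software.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class Status(Enum):
    CLEARED = "cleared"   # public domain status has been documented
    RENEWED = "renewed"   # a copyright renewal was found; withhold
    UNKNOWN = "unknown"   # not yet researched; withhold for now


@dataclass(frozen=True)
class Region:
    """One article, image, or ad on a scanned page (hypothetical model)."""
    page: int
    bbox: Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels
    description: str
    status: Status


def plan_display(regions: List[Region]) -> Tuple[List[Region], List[str]]:
    """Split regions into those to show and a summary of those withheld."""
    shown = [r for r in regions if r.status is Status.CLEARED]
    withheld = [r for r in regions if r.status is not Status.CLEARED]
    # The summary lets collaborators see what still needs clearance research.
    summary = [
        f"page {r.page}: {r.description} ({r.status.value})" for r in withheld
    ]
    return shown, summary


if __name__ == "__main__":
    issue = [
        Region(1, (0, 0, 800, 600), "local news story", Status.CLEARED),
        Region(7, (0, 0, 800, 300), "syndicated comic strip", Status.RENEWED),
        Region(7, (0, 300, 800, 600), "serialized fiction", Status.UNKNOWN),
    ]
    shown, summary = plan_display(issue)
    print(len(shown), "region(s) shown")
    for line in summary:
        print("withheld:", line)
```

Keeping withheld regions in the index, rather than deleting them, is what lets a section be opened later: once someone documents its status, only the `status` value needs to change.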
Moreover, if we can establish a workable clearance and selective display process for US newspapers, it will probably also work for most other serials published in the US. Most of them, whether magazines, scholarly journals, conference proceedings, newsletters, or trade publications, are no more complicated in their sources and structures than newspapers are, and they’re often much simpler. So I look forward to seeing how this expansion in scope up to 1963 works out for the National Digital Newspaper Program. And I hope we can use their example and experience to open access to a wider variety of serials as well.