Public Domain Day 2015: Ending our own enclosures

It’s the start of the new year, which, as many of my readers know, marks another Public Domain Day, when a year’s worth of creative work becomes free for anyone to use in many countries.

In countries where copyrights have been extended to life plus 70 years, works by people like Piet Mondrian, Edith Durham, Glenn Miller, and Ethel Lina White enter the public domain.  In countries that have resisted ongoing efforts to extend copyrights past life + 50 years, 2015 sees works by people like Flannery O’Connor, E. J. Pratt, Ian Fleming, Rachel Carson, and T. H. White enter the public domain. And in the US, once again no published works enter the public domain due to an ongoing freeze in copyright expirations (though some well-known works might have if we still had the copyright laws in effect when they were created).

But we’re actually getting something new worth noting this year.  Today we’re seeing scholarship-quality transcriptions of tens of thousands of early English books — the EEBO Text Creation Partnership Phase I texts — become available free of charge to the general public for the first time.  (As I write this, the books aren’t accessible yet, but I expect they will be once the folks in the project come back to work from the holiday.)  (Update: It looks like files and links are now on Github; hopefully more user-friendly access points are in the works as well.)

This isn’t a new addition to the public domain; the books being transcribed have been in the public domain for some time.  But it’s the first time many of them are generally available in a form that’s easily searchable and isn’t riddled with OCR errors.  For the rarer works, it’s the first time they’re available freely across the world in any form.  It’s important to recognize this milestone as well, because taking advantage of the public domain requires not just copyrights expiring or being waived, but also people dedicated to making the public domain available to the public.

And that is where we who work in institutions dedicated to learning, knowledge, and memory have unique opportunities and responsibilities.   Libraries, galleries, archives, and museums have collected and preserved much of the cultural heritage that is now in the public domain, and that is often not findable, and generally not shareable, anywhere else.  That heritage becomes much more useful and valuable when we share it freely with the whole world online than when we only give access to people who can get to our physical collections, or who can pay the fees and tolerate the usage restrictions of restricted digitized collections.

So whether or not we’re getting new works in the public domain this year, we have a lot of work to do, this year and in the years to follow, in making that work available to the world.  Wherever and whenever possible, those of us whose mission focuses more on knowledge than commerce should commit to making that work as openly accessible as possible, as soon as possible.

That doesn’t mean we shouldn’t work with the commercial sector, or respect their interests as well.  After all, we wouldn’t have seen nearly so many books become readable online in the early years of this century if it weren’t for companies like Google, Microsoft, and ProQuest digitizing them at much larger scale than libraries had previously done on their own.  As commercial firms, they’re naturally looking to make some money by doing so.  But they need us as much as we need them to digitize the materials we hold, so we have the power and duty to ensure that when we work with them, our agreements fulfill our missions to spread knowledge widely as well as their missions to earn a profit.

We’ve done better at this in some cases than in others.   I’m happy that many of the libraries who partnered with Google in their book scanning program retained the rights to preserve those scans themselves and make them available to the world in HathiTrust.   (Though it’d be nice if the Google-imposed restrictions on full-book downloads from there eventually expired.)  I’m happy that libraries who made deals with ProQuest in the 1990s to digitize old English books that no one else was then digitizing had the foresight to secure the right to make transcriptions of those books freely available to the world today.  I’m less happy that there’s no definite release date yet for some of the other books in the collection (the ones in Phase II, where the 5-year timer for public release doesn’t count down until that phase’s as-yet-unclear completion date), and that there appears to be no plan to make the page images freely available.

Working together, we in knowledge institutions can get around the more onerous commercial restrictions put on the public domain.  I have no issue with firms that make a reasonable profit by adding value: if, for instance, Melville House can quickly sell lots of printed and digitally transcribed copies of the US Senate Torture report for under $20, more power to them.  People who want to pay for the convenience of those editions can do so, and free public domain copies from the Senate remain available for those who want to read and repurpose them.

But when I hear about firms like Taylor and Francis charging as much as $48 to nonsubscribers to download a 19th century public domain article from their website for the Philosophical Magazine, I’m going to be much more inclined to take the time to promote free alternatives scanned by others.  And we can make similar bypasses of not-for-profit gatekeepers when necessary.  I sympathize with Canadian institutions having to deal with drastic funding cuts, which seem to have prompted Early Canadiana Online to put many of their previously freely available digitized books behind paywalls, but I still switched my links as soon as I could to free copies of most of the same books posted at the Internet Archive.  (I expect that increasing numbers of free page scans of the titles represented in Early English Books Online will show up there and elsewhere over time as well, from independent scanning projects if not from ProQuest.)

Assuming we can hold off further extensions to copyright (which, as I noted last year, is a battle we need to show up for now), four years from now we’ll finally have more publication copyrights expiring into the public domain in the US.  But there’s a lot of work we in learning and memory institutions can do now in making our public domain works available to the world.  For that matter, there’s a lot we can do in making the many copyrighted works we create available to the world in free and open forms.  We saw a lot of progress in that respect in 2014: Scholars and funders are increasingly shifting from closed-access to open-access publication strategies.  A coalition of libraries has successfully crowdfunded open-access academic monographs for less cost to them than for similar closed-access print books.  And a growing number of academic authors and nonprofit publishers are making open access versions of their works, particularly older works, freely available to the world while still sustaining themselves.  Today, for instance, I’ll be starting to list on The Online Books Page free copies of books that Ohio State University Press published in 2009, now that a 5-year-limited paywall has expired on those titles.  And, as usual, I’m also dedicating a year’s worth of 15-year-old copyrights I control (in this case, for work I made public in 2000) to the public domain today, since the 14-year initial copyright term that the founders of the United States first established is plenty long for most of what I do.

As we celebrate Public Domain Day today, let’s look to the works that we ourselves oversee, and resolve to bring down enclosures and provide access to as much of that work as we can.

Public Domain Day 2014: The fight for the public domain is on now

New Year’s Day is upon us again, and with it, the return of Public Domain Day, which I’m happy to see has become a regular celebration in many places over the last few years.  (I’ve observed it here since 2008.)  In Europe, the Open Knowledge Foundation gives us a “class picture” of authors who died in 1943, and whose works are now entering the public domain there and in other “life+70 years” countries.  Meanwhile, countries that still hold to the Berne Convention’s “life+50 years” copyright term, including Canada, Japan, New Zealand, and many others, get the works of authors who died in 1963.  (The Open Knowledge Foundation also has highlights for those countries, where Narnia/Brave-New-World/purloined-plums crossover fanfic is now completely legal.)  And Duke’s Center for the Study of the Public Domain laments that, for the 16th straight year, the US gets no more published works entering the public domain, and highlights the works that would have gone into the public domain here were it not for later copyright extensions.

It all starts to look a bit familiar after a few years, and while we may lament the delays in works entering the public domain, it may seem like there’s not much to do about it right now.  After all, most of the world is getting another year’s worth of public domain again on schedule, and many commentators on the US’s frozen public domain don’t see much changing until we approach 2019, when remaining copyrights on works published in 1923 are scheduled to finally expire.  By then, writers like Timothy Lee speculate, public domain supporters will be ready to fight the passage of another copyright term extension bill in Congress like the one that froze the public domain here back in 1998.

We can’t afford that sense of complacency.  In fact, the fight to further extend copyright is raging now, and the most significant campaigns aren’t happening in Congress or other now-closely-watched legislative chambers.  Instead, they’re happening in the more secretive world of international trade negotiations, where major intellectual property hoarders have better access than the general public, and where treaties can be used to later force extensions of the length and impact of copyright laws at the national level, in the name of “harmonization”.   Here’s what we currently have to deal with:

Remaining Berne holdouts are being pushed to add 20 more years of copyright.  Remember how I said that Canada, Japan, and New Zealand were all enjoying another year of “life+50 years” copyright expirations?  Quite possibly not for long.  All of those countries are also involved in the Trans-Pacific Partnership (TPP) negotiations, which include a strong push for more extensive copyright control.  The exact terms are being kept secret, but a leaked draft of the intellectual property chapter from August 2013 shows agreement by many of the countries’ trade negotiators to mandate “life+70 years” terms across the partnership.  That would mean a loss of 20 years of public domain for many TPP countries, and ultimately increased pressure on other countries to match the longer terms of major trade partners.  Public pressure from citizens of those countries can prevent this from happening; indeed, a leak from December hints that some countries that had favored extensions back in August are reconsidering.  So now is an excellent time to do as Gutenberg Canada suggests and let legislators and trade representatives know that you value the public domain and oppose further extensions of copyright.

Life+70 years countries still get further copyright extensions.   The push to extend copyrights further doesn’t end when a country abandons the “life+50 years” standard.  Indeed, just this past year the European Union saw another 20 years added on to the terms of sound recordings (which previously had a 50-year term of their own in addition to the underlying life+70 years copyrights on the material being recorded).  This extension is actually less than the 95 years that US lobbyists had pushed for, and are still pushing for in the Trans-Pacific Partnership, to match terms in the US.

(Why does the US have a 95-year term in the first place that it wants Europe to harmonize with?  Because of the 20-year copyright extension that was enacted in 1998 in the name of harmonizing with Europe.  As with climbers going from handhold to handhold and foothold to foothold higher in a cliff, you can always find a way to “harmonize” copyright ever upward if you’re determined to do so.)

The next major plateau for international copyright terms, life+100 years, is now in sight.  The leaked TPP draft from August also includes a proposal from Mexico to add yet another 30 years onto copyright terms, to life+100 years, which that country adopted not many years ago.  It doesn’t have much chance of passage in the TPP negotiations, where to my knowledge only Mexico has favored the measure.   But it makes “life+70” seem reasonable in comparison, and sets a precedent for future, smaller-scale trade deals that could eventually establish longer terms.  It’s worth remembering, for instance, that Europe’s “life+70” terms started out in only a couple of countries, spread to the rest of Europe in European Union trade deals, and then to the US and much of the rest of the world.  Likewise, Mexico’s “life+100” proposal might be more influential in smaller-scale Latin American trade deals, and once established there, spread to the US and other countries.  With 5 years to go before US copyrights are scheduled to expire again in significant numbers, there’s time for copyright maximalists to get momentum going for more international “harmonization”.

What’s in the public domain now isn’t guaranteed to stay there.  That’s been the case for a while in Europe, where the public domain is only now getting back to where it was 20 years ago.  (The European Union’s 1990s extension directive rolled back the public domain in many European countries, so in places like the United Kingdom, where the new terms went into effect in 1996, the public domain is only now getting to where it was in 1994.)  But now in the US as well, where “what enters the public domain stays in the public domain” has been a long-standing custom, the Supreme Court has ruled that Congress can in fact remove works from the public domain in certain circumstances.   The circumstances at issue in the case they ruled on?  An international trade agreement, which as we’ve seen above is now the prevailing way of getting copyrights extended in the first place.   Even an agreement that just establishes life+70 years as a universal requirement, but doesn’t include the usual grandfathered exception for older works, could put the public domain status of works going back as far as the 1870s into question, as we’ve seen with HathiTrust international copyright determinations.

But we can help turn the tide.  It’s also possible to cooperate internationally to improve access to creative works, and not just lock them up further.  We saw that start to happen this past year, for instance, with the signing of the Marrakesh Treaty on copyright exceptions and limitations, intended to ensure that those with disabilities that make it hard to read books normally can access the wealth of literature and learning available to the rest of the world.  The treaty still needs to be ratified before it can go into effect, so we need to make sure ratification goes through in our various countries.  It’s a hopeful first step in international cooperation increasing access instead of raising barriers to access.

Another improvement now being discussed is to require rightsholders to register ongoing interest in a work if they want to keep it under copyright past a certain point.  That idea, which reintroduces the concept of “formalities”, has been floated by some prominent figures like US Copyright Register Maria Pallante.  Such formalities would alleviate the problem of “orphan works” that are no longer being exploited by their owners but are not available for free use.   (And a sensible, uniform formalities system could be simpler and more straightforward than the old country-by-country formalities that Berne got rid of, or the formalities people already accept for property like motor vehicles and real estate.)  Pallante’s initial proposal represents a fairly small step; for compatibility with the Berne Convention, formalities would not be required until the last 20 years of a full copyright term.  But with enough public support, it could help move copyright away from a “one size fits all” approach to one that more sensibly balances the interests of various kinds of creators and readers.

We can also make our own work more freely available.  For the last several years, I’ve been applying my own personal “formalities” program, in which I release into the public domain works I’ve created that I don’t need to further limit.  So in keeping with the original 14-year renewable terms of US copyright law, I now declare that all work that I published in 1999, and that I have sole control of rights over, is hereby dedicated to the public domain via a CC0 grant.  (They join other works from the 1990s that I’ve also dedicated to the public domain in previous years.)  For 1999, this mostly consists of material I put online, including all versions of Catholic Resources on the Net, one of the first websites of its kind, which I edited from 1993 to 1999.  It also includes another year’s history of The Online Books Page.

Not that you have to wait 14 years to free your work.  This past year, I released much of the catalog data from the Online Books Page into the public domain.  The metadata in that site’s “curated collection” continues to be released as open data under a CC0 grant as soon as it is published, so other library catalogs, aggregators, and other sites can freely reuse, analyze, and republish it as they see fit.

We can do more with work that’s under copyright, or that seems to be.  Sometimes we let worries about copyright keep us from taking full advantage of what copyright law actually allows us to do with works.  In the past couple of years, we saw court rulings supporting the rights of Google and HathiTrust to use digitized, but not publicly readable, copies of in-copyright books for indexing, search, and preservation purposes.   (Both cases are currently being appealed by the Authors Guild.)  HathiTrust has also researched hundreds of thousands of book copyrights, and as of a month ago they’d enabled access to nearly 200,000 volumes that were classified as in-copyright under simple precautionary guidelines, but determined to be actually in the public domain after closer examination.

In the coming year, I’d like to see if we can do similar work to open up access to historical journals and other serials as well.  For instance, Duke’s survey of the lost public domain mentions that articles from 1957 in major science journals like Nature, Science, and JAMA are behind paywalls, but as far as I’ve been able to tell, none of those three journals renewed copyrights for their 1957 issues.  Scientists are also increasingly making current work openly available through open access journals, open access repositories, and even discipline-wide initiatives like SCOAP3, which also debuts today.

There are also some potentially useful copyright exemptions for libraries in Section 108 of US copyright law that we could use to provide more access to brittle materials, materials nearing the end of their copyright term, and materials used by print-impaired users.

Supporters of the public domain that sit around and wait for the next copyright extension to get introduced into their legislatures are like generals expecting victory by fighting the last war.  There’s a lot that public domain supporters can do, and need to do, now.  That includes countering the ongoing extension of copyright through international trade agreements, promoting initiatives to restore a proper balance of interest between rightsholders and readers, improving access to copyrighted work where allowed, making work available that’s new to the public domain (or that we haven’t yet figured out is out of copyright), and looking for opportunities to share our own work more widely with the world.

So enjoy the New Year and the Public Domain Day holiday.  And then let’s get to work.

Updates on library linking, Wikipedia, and what you can do

I’m gratified by the positive response I’ve been getting to the Forward To Libraries service I first introduced last month.  It really took off when I announced the templates for linking to libraries from Wikipedia a couple of weeks ago.   They’ve been written up in places like Boing Boing and in Wikipedia’s own Signpost newsletter.   The service now includes more than 150 libraries throughout the English-speaking world.  Wikipedia editors are also adding the link templates to articles: besides the handful I added myself, more than 450 have been added by other editors at this writing.  And I’ve heard from numerous librarians who now want to start editing Wikipedia themselves, both to add library links and to otherwise improve articles.  (Here’s how to become a Wikipedia editor.)

So far, I’ve largely provided this service on my own, with support from the University of Pennsylvania Libraries.   But I’d like to make the service more useful, and could use some help.  If you’re interested, here are some things you might want to know:

Some libraries are easier to link than others.   If you’re using one of many standard library catalogs or discovery systems, and you haven’t made substantial modifications to it, it’s easy for me to add your system. I basically just record what software you’re using and where on the Web the service runs, run some test searches to verify your system, and you’re good to go.  If you’re using a more customized, obscure, or home-grown system, I might still be able to add links to it, but it may take me more effort to figure out how to make useful search links into the system.  Any information you can provide would be helpful.  There are also certain off-the-shelf systems that I have problems with.  Many Polaris systems, for example, will give a “session timed out” message the first time you try to follow a search link into the system.   (Back up and try the link again, and everything will be fine for some time afterwards.)  Some other systems don’t seem to support deep search links in any consistent way that I’ve been able to determine; these include not just some very old session-based systems, but also EBSCO’s fairly new EDS discovery platform.

I’ve determined ways to link into these various systems from reading various documentation files I’ve found on the public Internet, along with some reverse-engineering of public web sites.  If you know of better ways to link to some of these systems that I haven’t yet figured out myself, and this information can be made public, let me know.
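To make the mechanics concrete, here’s a minimal sketch in Python of how deep search links like these can be built from per-system URL patterns.  The software names and URL templates below are my own illustrative assumptions (they resemble the public search URLs of systems like Koha and VuFind), not the actual data the service records:

```python
from urllib.parse import quote_plus

# Illustrative URL patterns keyed by catalog software.  These are
# assumptions resembling public Koha and VuFind search URLs, not the
# actual data the Forward To Libraries service uses.
SEARCH_TEMPLATES = {
    "koha":   "{base}/cgi-bin/koha/opac-search.pl?q={query}",
    "vufind": "{base}/Search/Results?lookfor={query}",
}

def deep_search_link(software, base_url, query):
    """Build a deep search link into a library's catalog.

    Raises KeyError for software with no known link pattern, such as
    session-based systems that don't support stable search URLs.
    """
    template = SEARCH_TEMPLATES[software]
    return template.format(base=base_url.rstrip("/"),
                           query=quote_plus(query))

print(deep_search_link("vufind", "https://catalog.example.edu/",
                       "Rachel Carson"))
# https://catalog.example.edu/Search/Results?lookfor=Rachel+Carson
```

The hard part in practice isn’t this lookup; it’s discovering a workable pattern for each system in the first place, which is why some catalogs can’t yet be included.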

For now, I’m declining to list libraries that don’t have many English-language subject or Library of Congress name headings, because the results of English searches in those libraries will be misleadingly incomplete.  But I’m considering ways to include translated searches, where the data to support this is available, for a wider range of countries.  (VIAF already provides much relevant data for names.)

The most popular new Wikipedia Library resource template is also controversial, and might be modified or deleted.   I provide a number of different templates for linking from Wikipedia to libraries, including the inlined text templates “Library resources about” and “Library resources by”, and the all-in-one sidebar template “Library resources box”. By far the most used of these templates has been the Library resources box.   It’s easy to spot in an article, it organizes links clearly, and it’s easy for editors to recognize as a template that they can add to articles they find of interest.  But some Wikipedians, including at least one Wikipedia admin, have objected to the template.  They cite style guidelines that say external link templates should not use boxes or other graphical elements, but only appear as inlined text.  I’ve defended the boxes, noted how other library-related external links commonly appear in boxes, and proposed ways to address various Wikipedian concerns.   But it’s ultimately up to the Wikipedia community to determine whether or how library links will appear in Wikipedia articles.  To find out more about the issues, see the Library resources box talk page.  And if you’re a Wikipedia editor or user, feel free to weigh in on that page or other relevant forums.

I’m exploring ways to make it easier for readers to get to our libraries.  For one, I’m starting to record IP ranges for some institutions, so that local network users can follow “resources in your library” links straight to the institution’s library, without having to first register a preference.  (Users can still register a different preference if they want.)  IP-based routing is an experimental service, initially being provided to a limited number of institutions, and I may modify or withdraw it in the future.  If you’d like me to consider it for your institution, you can submit a request, with the relevant IP ranges (preferably in CIDR format) in the “anything we should know?” field.  Note that the IP ranges you submit will be published as part of the library data I’m sharing for this project.
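For the IP-based routing described above, the CIDR matching itself is straightforward; Python’s standard ipaddress module can express it in a few lines.  The institution name and network ranges below are hypothetical (they use reserved documentation address blocks), just to show the shape of the lookup:

```python
import ipaddress

# Hypothetical institution ranges using reserved documentation blocks;
# the real service publishes each institution's submitted ranges as
# part of its shared library data.
INSTITUTION_NETWORKS = {
    "example_university": ["203.0.113.0/24", "198.51.100.0/24"],
}

def institution_for_ip(ip):
    """Return the institution whose registered CIDR ranges cover ip,
    or None if no registered range matches (in which case the reader
    would be asked to register a library preference instead)."""
    addr = ipaddress.ip_address(ip)
    for institution, ranges in INSTITUTION_NETWORKS.items():
        if any(addr in ipaddress.ip_network(r) for r in ranges):
            return institution
    return None

print(institution_for_ip("203.0.113.5"))   # example_university
print(institution_for_ip("192.0.2.99"))    # None
```

A lookup like this would run on each incoming request, with an explicitly registered preference still taking precedence over the network match.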

I’m starting to share my work on Github.  There is now a Github repository with selected data and code for the FTL project.  In it, you’ll find the data I use to link to the libraries enrolled in the service, and you’ll also see the code for the main CGI script used to forward readers to those libraries.   You can’t yet run the service out of the box yourself with the code and data provided so far, but I hope that what’s there will help people understand how the service works, and possibly implement similar services themselves if they’re so inclined.  The data’s released under CC0, so you can reuse it however you like; and the code is open-source licensed under the Educational Community License 2.0.  I hope to add more data and code over time, and I’m happy to hear suggestions for enhancements and improvements.
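As a rough sketch of what the forwarding step in a CGI script like that one does: look up the reader’s chosen library, fill in its search URL, and answer with an HTTP redirect.  The record shape and names below are my own illustration, not the repository’s actual data format:

```python
from urllib.parse import quote_plus

# Assumed shape of a library record; the project's actual data format
# is documented in its Github repository.
LIBRARIES = {
    "examplelib": {
        "name": "Example University Library",
        "search_url": "https://catalog.example.edu/Search/Results?lookfor={query}",
    },
}

def forward(library_id, topic):
    """Return the CGI response headers that redirect a reader from a
    'library resources' link into their chosen library's catalog."""
    record = LIBRARIES[library_id]
    target = record["search_url"].format(query=quote_plus(topic))
    return "Status: 302 Found\r\nLocation: {}\r\n\r\n".format(target)
```

Everything else in a real forwarding service, such as validating the library ID and choosing between name, subject, and title searches, builds around this core redirect.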

I’m hoping that as more people get involved, the service will improve, library resources will become more reachable online, and Wikipedia will become a more useful resource as well.  If you’d like to get involved yourself, I’d love to hear what you’re up to, and what suggestions you might have.