Everybody's Libraries

January 1, 2012

Public Domain Day 2012: Five things we can do in the US

Filed under: copyright,libraries,online books,open access — John Mark Ockerbloom @ 10:24 am

It’s New Year’s Day again, and in much of the world, this means another year’s worth of works enters the public domain.  That’s a cause for celebration, as European countries and many others that have “life+70 years” copyright terms welcome works by James Joyce, Virginia Woolf, Jelly Roll Morton, and Elizabeth von Arnim into the public domain.  The Communia Project’s Public Domain Day website focuses on works by these and many other authors that are entering (in many cases, re-entering) the public domain in “life+70 years” countries.  Meanwhile, folks in Canada, New Zealand, and other countries that have held the line at the “life+50 years” terms of the Berne Convention can now freely enjoy the works of people like James Thurber, Ernest Hemingway, and H.D.

There’s not so much excitement about Public Domain Day in the US, where no published works are scheduled to enter the public domain for another 7 years, due to a 20-year copyright extension enacted in 1998.  But Americans don’t have to simply sigh and contemplate what might have been if our copyright terms hadn’t been extended.  The new year still provides a number of important opportunities for Americans to improve access to the public domain.

1. Find and free newly public domain unpublished works

Some works are going into the public domain in the US today: works never published prior to 2003 (or copyrighted under US law prior to 1978) by authors who died in 1941– the same authors whose published works go into the public domain in Europe today.
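The date arithmetic behind these “life+N years” terms is simple enough to sketch in a few lines of Python.  This is only an illustration, and it assumes the usual rule that such a term runs through the end of the Nth calendar year after the author’s death; the function name is my own invention for this example, and real copyright status depends on many other details of national law.

```python
def public_domain_year(death_year: int, term_years: int) -> int:
    """Return the first calendar year in which works are in the public domain.

    Assumption: a "life+N" term runs through December 31 of the N-th year
    after the author's death, so works become free on the following January 1.
    """
    return death_year + term_years + 1


# Authors who died in 1941, in "life+70 years" countries (and for the
# unpublished works discussed above).
joyce_woolf = public_domain_year(1941, 70)

# Authors who died in 1961, in "life+50 years" countries.
thurber_hemingway = public_domain_year(1961, 50)

print(joyce_woolf, thurber_hemingway)
```

Both calculations land on the same Public Domain Day, which is why the two groups of authors are being celebrated together this year.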

But who would care about such obscure works? one might ask.  Well, if you’re at all interested in understanding the dense, allusion-laden fiction of Joyce, or the psychology of Woolf, or the jurisprudential thinking of Louis Brandeis, or the inner lives of any of the rest of the “class of 1941”, having the right to freely access, publish, and build on their unpublished works can be crucial.

Up until now, for instance, scholars studying James Joyce have often been frustrated by sharp restrictions and legal threats made by the administrator of Joyce’s literary estate.  In 2008, Rebecca Ganz characterized the administrator thus: “[His] primary purpose is to quell any scholarship that he finds distasteful or an invasion of his family’s privacy. He has a history of harassing authors and artists until they buckle under the strain of trying to obtain legal rights to quote from the late author’s writings.”  Scholars wishing to invoke Joyce’s unpublished works in their work have either had to undertake multi-year legal battles, or cut back on the lines of inquiry they might otherwise pursue.

American libraries and archives have many illuminating papers by authors who died in 1941– even non-US authors like Joyce and Woolf.  US digitizers, librarians, and archivists can open up and publicize these works.   In some cases, we’re uniquely positioned to do so, since their unpublished works may still be under copyright in some other countries.

2. Increase worldwide availability of public domain works

Many of the millions of digitized books on the Internet are hosted in the US, in large-scale repositories like Google Books, HathiTrust, and the Internet Archive.  Many of these services give limited access to non-US readers or materials.  Google and HathiTrust, for instance, limit non-US access by default to books published as long as 140 years ago, to avoid falling afoul of “life+70 years” copyright terms abroad.  JSTOR likewise limits access to non-US journal volumes published in 1870 or later.

With another year’s worth of copyrights expiring in “life+70 years” countries, it should be safe for these US-based services to also open up worldwide access to another year’s worth of works, further freeing up the public domain.  HathiTrust is also willing to manually review copyrights on specific books to open up access.  If you come across any books in HathiTrust solely by authors who died in 1941 (or before) that are currently labeled only as “public domain in the United States”, you can request that they review them for opening up access worldwide.  Just use the “Feedback” button at the bottom of the book’s HathiTrust page, or the suggestion form on my Online Books Page; and make sure you ask specifically for non-US access.

3. Restore access to obscure copyrighted works from 1936 (and earlier)

After libraries and archives expressed concerns about the fate of obscure works under longer copyright terms, Congress included a special exemption in their 1998 copyright term extension.  The exemption, codified as section 108(h) of the copyright law, states that “during the last 20 years of any term of copyright of a published work, a library or archives, including a nonprofit educational institution that functions as such, may reproduce, distribute, display, or perform in facsimile or digital form a copy or phonorecord of such work, or portions thereof, for purposes of preservation, scholarship, or research”, under certain conditions.  In particular, if the institution finds, after a reasonable investigation, that such a work is not “subject to normal commercial exploitation” (such as by being in print) and cannot “be obtained at a reasonable price”, and no rightsholder has filed a claim otherwise, the work qualifies for this special exemption.  As of this year’s Public Domain Day, qualifying publications from 1936 join what is now 14 years of works in this category.
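To see where the “14 years” figure comes from, here is a rough sketch of the date window in Python.  It assumes a 95-year term for these older published works, running through the end of the 95th calendar year after publication (consistent with 1923 copyrights expiring at the start of 2019).  The function and constant names are made up for this illustration, and the sketch checks only the dates, not the statute’s other conditions about commercial exploitation, price, or rightsholder claims.

```python
TERM_YEARS = 95    # assumed term for still-copyrighted works published from 1923 on
WINDOW_YEARS = 20  # "the last 20 years of any term of copyright"


def term_last_year(pub_year: int) -> int:
    """Last calendar year in which the work is still under copyright."""
    return pub_year + TERM_YEARS


def window_first_year(pub_year: int) -> int:
    """First calendar year that falls within the last 20 years of the term."""
    return term_last_year(pub_year) - WINDOW_YEARS + 1


def in_window(pub_year: int, current_year: int) -> bool:
    """True if current_year is within the last 20 years of the work's term."""
    return window_first_year(pub_year) <= current_year <= term_last_year(pub_year)


def eligible_publication_years(current_year: int, earliest: int = 1923) -> list[int]:
    """Publication years (from `earliest` on) whose works are in the window."""
    return [y for y in range(earliest, current_year) if in_window(y, current_year)]


years = eligible_publication_years(2012)
print(years[0], years[-1], len(years))
```

Under these assumptions, works published in 1936 enter the window in 2012, and each later Public Domain Day adds one more publication year until the 1923 copyrights finally expire.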

So far, I have found very little digitized content online where this exemption is explicitly invoked.  (There are advantages to explicitly doing so, both because it helps clarify the right to use the material, and helps prevent inadvertent unauthorized propagation of the works, such as the commercial reprints of digitized books that are now common on many large bookselling sites.)  Yet many of the works in HathiTrust’s (currently suspended) orphan works initiative, and in the Internet Archive’s lending library, and more besides, could well qualify for this treatment– and unlike orphan works, where legislation has yet to be passed, the exemption for these materials is already explicitly authorized by statute.

Providing online access for these works is not without controversy.  A 2002 article by lawyer Mary Minow details some of the possibilities and risks.   While she concludes that libraries can put such works on the Web, the recent Authors Guild complaint in its lawsuit against HathiTrust includes some push-back against this idea. But as the public domain in the US recedes further into history, and digital library projects increasingly look for ways to make our cultural heritage available online, American libraries would do well to proactively establish and exercise these rights for older works now languishing in obscurity.

4. Strengthen and sustain coalitions for reasonable copyright limits

The curtailment of the public domain is just one aspect of the overreach of copyright law in the US and elsewhere.  Right now, Congress is considering two bills, the Stop Online Piracy Act (SOPA) and the PROTECT IP Act (PIPA), whose enforcement provisions threaten to disrupt the core structures of the Internet and enable far-reaching censorship, in the name of stopping piracy.  Supporters of these bills hoped to have them passed by Christmas, but opposition from both “left” and “right” sides of the political spectrum has slowed the process down, caused some companies to withdraw support, and led to the proposal of less harmful alternatives for fighting piracy.

It’s still quite possible that SOPA and PIPA will pass, though.   Public Domain Day provides an opportunity for Americans to reflect on some of the good reasons for limiting the power and scope of copyright enforcement, and to redouble efforts to keep those limits reasonable.  Moreover, a coalition that can stop SOPA and PIPA can also work to prevent further extensions of copyright terms.  This can ensure that Americans will have more to celebrate in Public Domain Days to come– especially starting in 2019, when the remaining 1923 copyrights should finally expire in the US.

5. Give copyrights of your own to the public domain

Of course, those wishing to maximize public access and use of their works don’t have to wait for their copyrights to expire on their own.  They can dedicate them to the public domain any time they want.  Public Domain Day is a particularly auspicious time to make such gifts, no matter what country you’re in.  And with tools like the CC0 declaration, it’s easier than ever to do so.

A few years ago, I started an annual personal tradition of reviewing copyrights to works I’d created more than 14 years ago (the original initial term of copyright enacted by the founders of the US, and also approximately the ideal copyright term given in a recent economic analysis) and dedicating works to the public domain that I didn’t feel needed further copyright.  Accordingly, today I dedicate all the work of my creation that I published in 1997, and for which I still control rights, to the public domain.  For me, this consists primarily of websites like The Online Books Page as of that year, and other online writings.  But others have dedicated more high-profile material to the public domain after the same term.   And I’d be very happy to hear from others who are making similar dedications today (whether or not it’s after 14 years).

So, happy Public Domain Day to everyone in the US and elsewhere!  We all have things to celebrate, and things we can do, in the name of the public domain.

September 27, 2011

Libraries: Be careful what your web sites “Like”

Filed under: crimes and misdemeanors,data,libraries,people,privacy — John Mark Ockerbloom @ 6:15 pm

Imagine you’re working in a library, and someone with a suit and a buzz cut comes up to you, gestures towards a patron who’s leaving the building, and says “That guy you were just helping out; can you tell me what books he was looking at?”

Many librarians would react to this request with alarm.  The code of ethics adopted by the American Library Association states “We protect each library user’s right to privacy and confidentiality with respect to information sought or received and resources consulted, borrowed, acquired or transmitted.”  Librarians will typically refuse to give such information without a carefully-verified search warrant, and many are also campaigning against the particularly intrusive search demands authorized by the PATRIOT Act.

Yet it’s possible that the library in this scenario is routinely giving out that kind of information, without the knowledge or consent of librarians or patrons, via its web site.  These days, many sites, including those of libraries, invoke a variety of third-party services to construct their web pages.  For instance, some library sites use Google services to analyze site usage trends or to display book covers.  Those third party services often know what web page has been visited when they’re invoked, either through an identifier in the HTML or JavaScript code used to invoke the service, or simply through the Referer information passed from the user’s web browser.

Patron privacy is particularly at risk when the third party also knows the identity of users visiting sensitive pages (like pages disclosing books they’re interested in).  The social networking sites that many library patrons use, for instance, can often track where their users go on the Web, even after they’ve left the social sites themselves.

For instance, if you go to the website of the Farmington Public Library (a library I used a lot when growing up in Connecticut), and search through their catalog, you may see Facebook “Like” buttons on the results.  On this page, for example, you may see that four people (possibly more by the time you read this) have told Facebook they Liked the book Indistinguishable from Magic.  Now, you can probably easily guess that if you click the Like button, and have a Facebook account, then Facebook will know that you liked the book too.  No big surprise there.

But what you can’t easily tell is that  Facebook is informed you’ve looked at this book page, even if you don’t click on anything.  If you’re a Facebook user and haven’t logged out– and for a while recently, even if you have logged out– Facebook knows your identity.  And if Facebook knows who you are and what you’re looking at, it has the power to pass along this information. It might do it through a “frictionless sharing” app you decided to try.  Or it might quietly provide it to organizations that it can sell your data to as permitted in its frequently changing data use policies.  (Which for a while even included tracking non-members.)

For some users, it might not be a big deal if it’s generally known what books they’re looking at online. But for others it definitely is a big deal, at least some of the time.  The problem with third-party inclusions like the Facebook “Like” button in catalogs is that library patrons may be denied the opportunity to give informed consent to sharing their browsing with others.  Libraries committed to protecting their patrons’ privacy as part of their freedom to read need to carefully consider what third party services they invite to “tag along” when patrons browse their sites.

This isn’t just a Facebook issue.  Similar issues come up with other third-party services that also track individuals, as for instance Google does.  Libraries also have good reasons to partner with third party sites for various purposes.  For some of these purposes, like ebook provision, privacy concerns are fairly well understood and carefully considered by most libraries.  But librarians might not keep as close track of the development of their own web sites, where privacy leaks can spring up unnoticed.

So if any of your web sites (especially your online catalogs or other discovery and delivery services) use third party web services, consider carefully where and how they’re being invoked.  For each third party, you should ask what information they can get from users browsing your web site, what other information they have from other sources (like the “real names” and exact birthdates that sites like Facebook and Google+ demand), and what real guarantees, if any, they make about the privacy of the information.  If you can’t easily get satisfactory answers to these questions, then reconsider your use of these services.
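As a starting point for that kind of review, a short script can list the outside hosts that a page asks the browser to contact as soon as it loads.  What follows is a minimal sketch using only Python’s standard library, not a complete audit tool: the class and function names, and the example.org, example.com, and example.net addresses, are invented for illustration.  It looks only at script, image, and iframe tags in the static HTML, so it will miss anything that scripts load later on their own, and it treats any different hostname (even a sibling subdomain you run yourself) as a third party.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ThirdPartyFinder(HTMLParser):
    """Collect hosts of automatically loaded resources not on the page's own host."""

    # Tags whose "src" is fetched when the page loads (illustrative, not exhaustive).
    AUTO_LOADED_TAGS = {"script", "img", "iframe"}

    def __init__(self, page_url: str) -> None:
        super().__init__()
        self.page_url = page_url
        self.page_host = urlparse(page_url).hostname
        self.hosts: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag not in self.AUTO_LOADED_TAGS:
            return
        src = dict(attrs).get("src")
        if not src:
            return
        # Resolve relative and protocol-relative URLs against the page URL.
        host = urlparse(urljoin(self.page_url, src)).hostname
        if host and host != self.page_host:
            self.hosts.add(host)


def third_party_hosts(page_url: str, html: str) -> list[str]:
    """Return the sorted third-party hosts referenced by the page's HTML."""
    finder = ThirdPartyFinder(page_url)
    finder.feed(html)
    return sorted(finder.hosts)


# A hypothetical catalog record page with one local image and two outside inclusions.
sample_html = """
<img src="/covers/12345.jpg">
<script src="//widgets.example.com/like.js"></script>
<iframe src="https://social.example.net/plugin?href=record-12345"></iframe>
"""
print(third_party_hosts("https://catalog.example.org/record/12345", sample_html))
```

Each host such a script turns up is one to bring the questions above to, since the browser will typically tell that host which catalog page the patron was viewing.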

June 15, 2011

A digital public library we still need, and could build now

Filed under: citizen librarians,copyright,libraries,people,sharing — John Mark Ockerbloom @ 12:39 pm

It’s been more than half a year since the Digital Public Library of America project was formally launched, and I’m still trying to figure out what the project organizers really want it to be.  The idea of “a digital library in service of the American public” is a good one, and many existing digital libraries already play that role in a variety of ways.  As I said when I christened this blog, I’m all for creating a multitude of libraries to serve a diversity of audiences and information needs.

At a certain point after an enthusiastic band of performers says “Let’s put on a show!”, though, someone has to decide what their show’s going to be about, and start focusing effort there.  So far, the DPLA seems to be taking an opportunistic approach.  Instead of promulgating a particular blueprint for what they’ll do, they’re asking the community for suggestions, in a “beta sprint” that ends today.   Whether this results in a clear distinctive direction for the project, or a mishmash of ideas from other digitization, aggregation, preservation, and public service initiatives, remains to be seen.

Just about every digital project I’ve seen is opportunistic to some extent.   In particular, most of the big ones are opportunistic when it comes to collection development.  We go after the books, documents, and other knowledge resources that are close to hand in our physical collections, or that we find people putting on the open web, or that our users suggest, or volunteer to provide on their own.

There are a number of good reasons for this sort of opportunism.  It lets us reuse work that we don’t have to redo ourselves.  It can inform us of audience interests and needs (at least as far as the interests of the producers we find align with the interests of the consumers we serve).  And it’s cheap, and that’s nothing to sneer at when budgets are tight.

But the public libraries that my family prefers to use don’t, on the whole, have opportunistically built collections.  Rather, they have collections shaped primarily by the needs of their patrons, and not primarily by the types of materials they can easily acquire.   The “opportunistic” community and school library collections I’ve seen tend to be the underfunded ones, where books in which we have yet to land on the Moon, the Soviet Union is still around, or Alaska is not yet a state may be more visible than books that reflect current knowledge or world events.  The better libraries may still have older titles in their research stacks, but they lead with books that have current relevance to their community, and they go out of their way to acquire reliable, readable resources for whatever information needs their users have.  In other words, their collections and services are driven by  demand, not supply.

In the digital realm, we have yet to see a library that freely provides such a digital collection at large scale for American public library users.   Which is not to say we don’t have large digital book collections– the one I maintain, for instance, has over a million freely readable titles, and Google Books and lots of other smaller digital projects have millions more.  But they function more as research or special-purpose collections than as collections for general public reference, education, or enjoyment.

The big reason for this, of course, is copyright.  In the US, anyone can freely digitize books and other resources published before 1923, but providing anything published after that requires copyright research and, usually, licensing, that tends to be both complex and expensive.  So the tendency of a lot of digital library projects is to focus on the older, obviously free material, and have little current material.  But a generally useful digital public library needs to be different.

And it can be, with the right motivation, strategy, and support.  The key insight is that while a strong digital public library needs to have high-quality, current knowledge resources, it doesn’t need to have all such resources, or even the most popular or commercially successful ones.  It just needs to acquire and maintain a few high-quality resources for each of the significant needs and aptitudes of its audience. Mind you, that’s still a lot of ground to cover, especially when you consider all the ages, education levels, languages, physical and mental abilities, vocational needs, interests, and demographic backgrounds that even a midsized town’s public library serves.  But it’s still a substantially smaller problem, and involves a smaller cost, than the enticing but elusive idea of providing instant free online access to everything for everyone.

There are various ways public digital libraries could acquire suitable materials proactively.  The America.gov books collection provides one interesting example.  The US State Department wanted to create a library of easy-to-read books on civics and American culture and history for an international audience.  Some of these books were created in-house by government staff.  Others were commissioned to outside authors.  Still others were adapted from previously published works, for which the State Department acquired rights.

A public digital library could similarly create, commission, solicit, or acquire rights to books that meet unfilled information needs of its patrons.  Ideally it would aim to acquire rights not just to distribute a work as-is, but also to adapt and remix it into new works, as many Creative Commons licenses allow.  This can potentially greatly increase the impact of any given work.  For instance, a compellingly written, beautifully illustrated book on dinosaurs might be originally written for 9-12 year old English speakers, and be noticeably obsolete due to new discoveries after 5 or 10 years.  But if a library’s community has reuse and adaptation rights, library members can translate, adapt, and update the book, so it becomes useful to a larger audience over a longer period of time.

This sort of collection building can potentially be expensive; indeed, it’s sobering that America.gov has now ceased being updated, due to budget cuts.  But there’s a lot that can be produced relatively inexpensively.  Khan Academy, for example, contains thousands of short, simple educational videos, exercises, and assessments created largely by one person, with the eventual goal of systematically covering the entire standard K-12 curriculum.  While I think a good educational library will require the involvement of many more people, the Khan example shows how much one person can get accomplished with a small budget, and projects like Wikipedia show that there’s plenty of cognitive surplus to go around, that a public library effort might usefully tap into.

Moreover, the markets for rights to previously authored content can potentially be made much more efficient than they are now.  Most books, for instance, go out of print relatively quickly, with little or no commercial exploitation thereafter.  And as others have noted, just trying to get permission to use  a work digitally, even apart from any royalties, can be very expensive and time-consuming.  But new initiatives like Gluejar aim to make it easier to match up people who would be happy to share their book rights with people who want to reuse them. Authors can collect a small fee (which could easily be higher than the residual royalties on an out-of-print book); readers get to share and adapt books that are useful to them.   And that can potentially be much cheaper than acquiring the rights to a new work, or creating one from scratch.

As I’ve described above, then, a digital public library could proactively build an accessible collection of high-quality, up to date online books and other knowledge resources, by finding, soliciting, acquiring, creating, and adapting works in response to the information needs of its users.  It would build up its collection proactively and systematically, while still being opportunistic enough to spot and pursue fruitful new collection possibilities.  Such a digital library could be a very useful supplement to local public libraries, would be open any time anywhere online, and could provide more resources and accessibility options than a local public library could provide on its own.  It would require a lot of people working together to make it work, including bibliographers, public service liaisons, authors, technical developers, and volunteers, both inside and outside existing libraries.  And it would require ongoing support, like other public libraries do, though a library that successfully serves a wide audience could also potentially tap into a wide base of funds and in-kind contributions.

Whether or not the DPLA plans to do it, I think a large-scale digital free public library with a proactively-built, high-quality, broad-audience general collection is something that a civilized society can and should build.  I’d be interested in hearing if others feel the same, or have suggestions, critiques, or alternatives to offer.

April 9, 2011

Opt in for open access

Filed under: copyright,libraries,online books,open access — John Mark Ockerbloom @ 8:40 am

There’s been much discussion online about Judge Chin’s long-awaited decision to reject the settlement proposed by Google and authors and publishers’ organizations over the Google Books service. Settlement discussions continue (and the court has ordered a status conference for April 25).  But it’s clear that it will be a while before this case is fully settled or decided.

Don’t count on a settlement to produce a comprehensive library

When the suit is finally resolved, it will not enable the comprehensive retrospective digital library I had been hoping for.  That, Chin clearly indicated, was an over-reach.  The  proposed settlement would have allowed Google to sell access to most pre-2009 books published in the English-speaking world whose rightsholders had not opted out.   But, as Chin wrote, “the case was about the use of an indexing and searching tool, not the sale of complete copyrighted works.”  The changes in the American copyright regime that the proposed settlement entailed, he wrote, were too sweeping for a court to approve.

Unless Congress makes changes in copyright law, then, a rightsholder has to opt in for a copyrighted book to be made readable on Google (or on another book site).  Chin’s opinion ends with a strong recommendation for the parties to craft a settlement that would largely be based on “opt-in”.  Of course, an “opt in” requirement necessarily excludes orphan works, where one cannot find a rightsholder to opt in.  And as John Wilkin recently pointed out, it’s likely that a lot of the books held by research libraries are orphan works.

Don’t count on authors to step up spontaneously

Chin expects that many authors will naturally want to opt in to make their works widely available, perhaps even without payment.  “Academic authors, almost by definition, are committed to maximizing access to knowledge,” he writes.  Indeed, one of the reasons he gives for rejecting the settlement is the argument, advanced by Pamela Samuelson and some other objectors, that the interests of academic and other non-commercially motivated authors are different from those of the commercial organizations that largely drove the settlement negotiations.

I think that Chin is right that many authors, particularly academics, care more about having their work appreciated by readers than about making money off of it.  And even those who want to maximize their earnings on new releases may prefer freely sharing their out of print books to keeping them locked away, or making a pittance on paywall-mediated access.  But that doesn’t necessarily mean that we’ll see all, or even most, of these works “opted in” to a universally accessible library.  We’ve had plenty of experience with institutional repositories showing us that even when authors are fine in principle with making their work freely available, most will not go out of their way to put their work in open-access repositories, unless there are strong forces mandating or proactively encouraging it.

Don’t count on Congress to solve the problem

The closest analogue to a “mandate” for making older books generally available would be orphan works legislation.    If well crafted, such a law could make a lot of books available to the public that now have no claimants, revenue, or current audience, and I hope that a coalition can come together to get a good law passed. But an orphan works law could take years to adopt (indeed, it’s already been debated for years). There’s no guarantee on how useful or fair the law that eventually gets passed would be, after all the committees and interest groups are done with it.  And even the best law would not cover many books that could go into a universal digital library.

Libraries have what it takes, if they’re proactive

On the other hand, we have an unprecedented opportunity right now to proactively encourage authors (academic or otherwise) to make their works freely available online.  As Google and various other projects continue to scan books from library collections, we now have millions of these authors’ books deposited in “dark” digital archives.  All an interested author has to do is say the word, and the dark  copy can be lit up for open access.  And libraries are uniquely positioned to find and encourage the authors in their communities to do this.

It’s now pretty easy to do, in many cases.  Hathi Trust, a coalition of a growing number of research institutions, currently has over 8 million volumes digitized from member libraries.  Most of the books are currently inaccessible due to copyright.  But they’ve published a permission agreement form that an author or other rightsholder can fill out and send in if they want to make their book freely readable online.  The form could be made a bit clearer and more visible, but it’s workable as it is.  As editor of The Online Books Page, I not infrequently hear from people who want to share their out of print books, or those of their ancestors, with the world.  Previously, I had to worry about how the books would get online.  Now I usually can just verify it’s in Hathi’s collection, and then refer them to the form.

Google Books also lets authors grant access rights through their partner program.  Joining the program is more complicated than sending in the Hathi form, and it’s more oriented towards selling books than sharing them.  But Google Books partners can declare their books freely readable in full if they wish, and can give them Creative Commons licenses (as they can with Hathi).  Google has even more digitized books in its archives than Hathi does.

So, all those who would love to see a wide-ranging (if not entirely comprehensive), globally accessible digital library now have a real opportunity to make it happen.  We don’t have to wait for Congress to act, or some new utopian digital library to arise.  Thanks to mass digitization, library coalitions like Hathi’s, and the development of simplified, streamlined rights and permissions processes, it’s easier than ever for interested authors (and heirs, and publishers) to make their work freely available online.  If those of us involved in libraries, scholarship, and the open access movement work to open up our own books, and those of our colleagues, we can light up access to the large, universal digital library that’s now waiting for us online.

October 29, 2010

September 8, 2010

June 11, 2010

Journal liberation: A primer

Filed under: copyright,libraries,open access,publishing,sharing — John Mark Ockerbloom @ 10:07 am

As Dorothea Salo recently noted, the problem of limited access to high-priced scholarly journals may be reaching a crisis point.  Researchers that are not at a university, or are at a not-so-wealthy one, have long been frustrated by journals that are too expensive for them to read (except via slow and cumbersome inter-library loan, or distant library visits).  Now, major universities are feeling the pain as well, as bad economic news has forced budget cuts in many research libraries, even as further price increases are expected for scholarly journals.  This has forced many libraries to consider dropping even the most prestigious journals, when their prices have risen too high to afford.

Recently, for instance, the University of California, which has been subject to significant budget cuts and furloughs, sent out a letter in protest of Nature Publishing Group’s proposal to raise their subscription fees by 400%.  The letter raised the possibility of cancelling all university subscriptions to NPG, and having scholars boycott the publisher.

Given that Nature is one of the most prestigious academic journals now publishing, one that has both groundbreaking current articles and a rich history of older articles, these are strong words.  But dropping subscriptions to journals like Nature might not be as much of a hardship for readers as it once might have been.  Increasingly, it’s possible to liberate the research content of academic journals, both new and old, for the world.  And, as I’ll explain below, now may be an especially opportune time to do that.

Liberating new content

While some of the content of journals like Nature is produced by the journal’s editorial staff or other writers for hire, the research papers are typically written by outside researchers, employed by universities and other research institutions.  These researchers hold the original copyright to their articles, and even if they sign an agreement with a journal to hand over rights to them (as they commonly do), they retain whatever rights they don’t sign over.  For many journals, including the ones published by Nature Publishing Group, researchers retain the right to post the submitted version of their paper (known as a “preprint”) in local repositories.  (According to the Romeo database, they can also eventually post the “postprint”– the final draft resulting after peer review, but before actual publication in the journal– under certain conditions.)  These drafts aren’t necessarily identical to the version of record published in the journal itself, but they usually contain the same essential information.

So if you, as a reader, find a reference to a Nature paper that you can’t access, you can search to see if the authors have placed a free copy in an open access repository. If they haven’t, you can contact one of them to encourage them to do so.  To find out more about providing open access to research papers, see this guide.

If a journal’s normal policies don’t allow authors to share their work freely in an open access repository, authors may still be able to retain their rights with a contract addendum or negotiation.  When that hasn’t worked, some academics have decided to publish in, or review for, other journals, as the California letter suggests.  (When pushed too far, some professors have even resigned en masse from editorial boards to start new journals that are friendlier to authors and readers.)

If nothing else, scholarly and copyright conventions generally respect the right of authors to send individual copies of their papers to colleagues that request them.  Some repository software includes features that make such copies extremely easy to request and send out.  So even if you can’t find a free copy of a paper online already, you can often get one if you ask an author for it.

Liberating historic content

Many journals, including Nature, are important not only for their current papers, but for the historic record of past research contained in their back issues.  Those issues may be difficult to get a hold of, especially as many libraries drop print subscriptions, deaccession old journal volumes, or place them in remote storage.  And electronic access to old content, when it’s available at all, can be surprisingly expensive.  For instance, if I want to read this 3-paragraph letter to the editor from 1872 on Nature’s web site, and I’m not signed in at a subscribing institution, the publisher asks me to pay them $32 to read it in full.

Fortunately, sufficiently old journals are in the public domain, and digitization projects are increasingly making them available for free.  Nearly all volumes of Nature published before 1923 can now be read freely online, thanks to scans made available to the public by the University of Wisconsin, Google, and Hathi Trust.  I can therefore read the letters from that 1872 issue, on this page, without having to pay $32.

Mass digitization projects typically stop providing public access to content published after 1922, because copyright renewals after that year might still be in force.  However, most scholarly journals– including, as it turns out, Nature — did not file copyright renewals.  Because of this, Nature issues are actually in the public domain in the US all the way through 1963 (after which copyright renewal became automatic).  By researching copyrights for journals, we can potentially liberate lots of scholarly content that would otherwise be inaccessible to many. You can read more about journal non-renewal in this presentation, and research copyright renewals via this site.

Those knowledgeable about copyright renewal requirements may worry that the renewal requirement doesn’t apply to Nature, since it originates in the UK, and renewal requirements currently only apply to material that was published in the US before, or around the same time as, it was published abroad.  However, offering to distribute copies in the US counts as US publication for the purposes of copyright law.  Nature did just that when they offered foreign subscriptions to journal issues and sent them to the US; and as one can see from the stamp of receipt on this page, American universities were receiving copies within 30 days of the issue date, which is soon enough to retain the US renewal requirement.  Using similar evidence, one can establish US renewal requirements for many other journals originating in other countries.
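For readers who like to see rules of thumb written down, here is a rough sketch of the reasoning in this section as a small Python function.  It is a simplification under stated assumptions, not a full account of US copyright law: the function name, its parameters, and the result strings are all made up for illustration, it works only at the granularity of whole years, and it ignores other requirements (such as copyright notice) that a real determination would have to consider.

```python
def us_status_of_journal_issue(publication_year: int,
                               renewal_filed: bool,
                               us_publication_within_30_days: bool = True) -> str:
    """Rough US copyright status of a journal issue, as seen from 2010.

    Hypothetical helper for illustration only. It encodes just the
    three cases discussed in the post and nothing else.
    """
    # Issues published through 1922 are old enough to be public domain.
    if publication_year <= 1922:
        return "public domain"

    # The renewal requirement only applies when the issue was published
    # in the US before, or around the same time as, it appeared abroad.
    if not us_publication_within_30_days:
        return "needs further research"

    # From 1923 through 1963, copyright lapsed unless a renewal was filed.
    if publication_year <= 1963:
        return "in copyright" if renewal_filed else "public domain"

    # From 1964 on, renewal is automatic.
    return "in copyright"
```

Under these assumptions, a 1950 issue with no renewal on file comes out as public domain, while a 1964 issue is still in copyright whether or not anyone filed a renewal.  The hard part in practice is not the function but its inputs: establishing whether a renewal was filed, and whether copies reached the US soon enough.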

Minding the gap

This still leaves a potential gap between the end of the public domain period and the present.  That gap is only going to grow wider over time, as copyright extensions continue to freeze the growth of the public domain in the US.

But the gap is not yet insurmountable, particularly for journals that are public domain into the 1960s.  If a paper published in 1964 included an author who was a graduate student or a young researcher, that author may well be still alive (and maybe even be still working) today, 46 years later.  It’s not too late to try to track authors down (or their immediate heirs), and encourage and help them to liberate their old work.

Moreover, even if those authors signed away all their rights to journal publishers long ago, or don’t remember if they still have any rights over their own work, they (or their heirs) may have an opportunity to reclaim their rights.  For some journal contributions between 1964 and 1977, copyright may have reverted to authors (or their heirs) at the time of copyright renewal, 28 years after initial publication.  In other cases, authors or heirs can reclaim rights assigned to others, using a termination of transfer.  Once authors regain their rights over their articles, they are free to do whatever they like with them, including making them freely available.

The rules for reversion of author’s rights are rather arcane, and I won’t attempt to explain them all here.  Terminations of transfer, though, involve various time windows when authors have the chance to give notice of termination, and reclaim their rights.  Some of the relevant windows are open right now.   In particular, if I’ve done the math correctly, 2010 marks the first year one can give notice to terminate the transfer of a paper copyrighted in 1964, the earliest year in which most journal papers are still under US copyright.  (The actual termination of a 1964 copyright’s transfer won’t take effect for another 10 years, though.)  There’s another window open now for copyright transfers from 1978 to 1985; some of those terminations can take effect as early as 2013.  In the future, additional years will become available for author recovery of copyrights assigned to someone else.  To find out more about taking back rights you, or researchers you know, may have signed away decades ago, see this tool from Creative Commons.
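The date arithmetic behind those windows can be sketched in a few lines of Python.  This is only a year-level approximation, and it rests on assumptions I should state plainly: for works copyrighted before 1978, termination can take effect during a five-year period starting 56 years after the copyright date; for transfers made in 1978 or later, the five-year period starts 35 years after the transfer (the special rules for grants that cover publication rights are ignored here); and in both cases notice must be given between 10 and 2 years before the chosen effective date.  The class and function names are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TerminationWindow:
    """Year-level approximation of a termination-of-transfer window."""
    first_effective_year: int  # earliest year termination can take effect
    last_effective_year: int   # end of the five-year termination period
    first_notice_year: int     # earliest year notice can be given
    last_notice_year: int      # latest year notice can be given


def _window(first_effective_year: int) -> TerminationWindow:
    last_effective_year = first_effective_year + 5
    return TerminationWindow(
        first_effective_year=first_effective_year,
        last_effective_year=last_effective_year,
        # Notice: no more than 10 and no less than 2 years in advance.
        first_notice_year=first_effective_year - 10,
        last_notice_year=last_effective_year - 2,
    )


def window_for_pre_1978_copyright(copyright_year: int) -> TerminationWindow:
    """Assumed rule: period opens 56 years after the copyright date."""
    return _window(copyright_year + 56)


def window_for_transfer_since_1978(transfer_year: int) -> TerminationWindow:
    """Assumed rule: period opens 35 years after the transfer."""
    return _window(transfer_year + 35)


def transfer_years_with_open_notice_window(current_year: int) -> list[int]:
    """Transfer years (1978 on) for which notice can be given this year."""
    return [
        year
        for year in range(1978, current_year + 1)
        if (window_for_transfer_since_1978(year).first_notice_year
            <= current_year
            <= window_for_transfer_since_1978(year).last_notice_year)
    ]
```

With these assumptions, `window_for_pre_1978_copyright(1964)` gives a first notice year of 2010 and a first effective year of 2020, and `transfer_years_with_open_notice_window(2010)` returns the years 1978 through 1985, matching the windows described above.  Actual deadlines turn on exact dates rather than calendar years, so treat the output as a prompt to look more closely, not as a filing calendar.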

Recognizing opportunity

To sum up, we have opportunities now to liberate scholarly research over the full course of scholarly history, if we act quickly and decisively.  New research can be made freely available through open access repositories and journals.  Older research can be made freely available by establishing its public domain status, and making digitizations freely available.  And much of the research in the not-so-distant past, still subject to copyright, can be made freely available by looking back through publication lists, tracking down researchers and rights information, and where appropriate reclaiming rights previously assigned to journals.

Journal publishing plays an important role in the certification, dissemination, and preservation of scholarly information.  The research content of journals, however, is ultimately the product of scholars themselves, for the benefit of scholars and other knowledge seekers everywhere.   However the current dispute is ultimately resolved between Nature Publishing Group and the University of California, we would do well to remember the opportunities we have to liberate journal content for all.

March 23, 2010

Lots of conversation keeps stuff sustainable

Filed under: libraries,people,preservation,sharing — John Mark Ockerbloom @ 10:12 pm

Among the hats I wear at my place of work is that of LOCKSS cache administrator. LOCKSS is a useful distributed preservation system built around the principle “Lots of copies keep stuff safe” (whose initials give the system its name).  The idea is that, with the cooperation of publishers, a bunch of libraries each harvest copies of selected online content, and keep backups on our own LOCKSS caches, which are hooked up to local library proxy services.  Then, if the material ever becomes inaccessible from the publisher, our users will automatically be routed to our local copies.  Each LOCKSS cache also periodically checks with other LOCKSS caches to ensure that our copies are still in good shape, and to repair or replace copies that have been lost or damaged.  (Various security features protect against leaks of restricted content, or unauthorized revisions of content.)

LOCKSS is open source software that runs on commodity hardware.  It was originally envisioned to run virtually automatically.  As Chris Dobson described the ideal in a 2003 Searcher article, “Take a computer a generation past its prime…. Hook it up to the Internet and put it in a closet. Stick in the LOCKSS CD-ROM and boot it up. Close the closet door.”  And then presumably walk away and forget about it.

Of course, it’s not that simple in practice, particularly if your library is proactive about its preservation strategy.  The thing about preservation at scale is there’s always something that needs attention.  It might be something technical, or content-related, or planning-related, but preserving a growing collection requires ongoing thought.  And if you want to think as clearly and sensibly as you can, you’ll want to collaborate.

Right now, for instance, I’m trying to get my cache to harvest the full run of a journal that’s just been made available for LOCKSS harvesting, where we hope to provide post-cancellation access through LOCKSS.  Someone at Stanford just gave me a useful tip on how to give this journal priority over the other volumes I’ve got queued up for harvest.  Unfortunately, I can’t try it out until I get my cache back up after it failed to reboot cleanly after a power failure. While I wait to hear back instructions about how best to remedy this, I wonder whether switching to a new Linux-based version of LOCKSS might make such operating system-level problems easier to deal with.  But it would be useful to hear from folks who are running that version to see what their experience has been.

Meanwhile, we’re wondering how best to approach new publishers who have content that our bibliographers would like to preserve via LOCKSS. Our special collections folks wonder whether we should preserve some of our own home-grown content via a private LOCKSS network.  I’m also doing some ongoing monitoring and testing of our LOCKSS cache’s behavior (some of which I’ve reported on earlier), and would be interested in knowing if others are seeing some of the same kinds of things that I see on the cache I administer.

In short, there are a lot of things to think about, when LOCKSS plays a significant role in a preservation plan.  And a lot of the issues I’ve mentioned above are ones that others may be thinking about as well.  So let’s talk about them.  As the LOCKSS group has said, “A vibrant, active, and engaged user community is key to the success of Open-Source efforts like LOCKSS.”

One thing you need for such an engaged community is a forum for them to talk to each other.  As it turns out, the LOCKSS group at Stanford tell me they created a LOCKSS Forum mailing list a while back, but I haven’t yet seen it publicized.   Its information page is at https://mailman.stanford.edu/mailman/listinfo/lockss-forum .  (Currently, archived email messages are not visible on the open web, though this may change in the future.)  If you’re interested in talking with others about how you use or might use LOCKSS to preserve access to digital content, I invite you to sign up and help get the conversation going.

December 10, 2009

December 4, 2009
