Everybody's Libraries

January 1, 2012

Public Domain Day 2012: Five things we can do in the US

Filed under: copyright,libraries,online books,open access — John Mark Ockerbloom @ 10:24 am

It’s New Year’s Day again, and in much of the world, this means another year’s worth of works enters the public domain.  That’s a cause for celebration, as countries in Europe and elsewhere that have “life+70 years” copyright terms welcome works by James Joyce, Virginia Woolf, Jelly Roll Morton, and Elizabeth von Arnim into the public domain.  The Communia Project’s Public Domain Day website focuses on works by these and many other authors that are entering (in many cases, re-entering) the public domain in “life+70 years” countries.  Meanwhile, folks in Canada, New Zealand, and other countries that have held the line at the “life+50 years” terms of the Berne Convention can now freely enjoy the works of people like James Thurber, Ernest Hemingway, and H.D.
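For readers who want to check the arithmetic behind these "classes", here is a minimal Python sketch.  It assumes (as is the usual convention, though details vary by country) that a "life+N years" term runs through the end of the Nth calendar year after the author's death, so that works become free on the following January 1.  The function name is my own, purely for illustration.

```python
def death_year_entering_public_domain(new_year: int, years_after_death: int) -> int:
    """Death year of the authors whose works become free on January 1 of new_year.

    Assumes the term runs through the end of the Nth calendar year
    after the author's death (an assumption; local rules can differ).
    """
    return new_year - years_after_death - 1


# The two regimes discussed above, as of Public Domain Day 2012.
for label, term in [("life+70", 70), ("life+50", 50)]:
    print(label, death_year_entering_public_domain(2012, term))
```

Under that assumption, the "life+70" class for 2012 is authors who died in 1941, and the "life+50" class is authors who died in 1961.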

There’s not so much excitement about Public Domain Day in the US, where no published works are scheduled to enter the public domain for another 7 years, due to a 20-year copyright extension enacted in 1998.  But Americans don’t have to simply sigh and contemplate what might have been if our copyright terms hadn’t been extended.  The new year still provides a number of important opportunities for Americans to improve access to the public domain.

1. Find and free newly public domain unpublished works

Some works are going into the public domain in the US today: works by authors who died in 1941 that were never published prior to 2003 (or copyrighted under US law prior to 1978).  These are the same authors whose published works go into the public domain in Europe today.

But who would care about such obscure works? one might ask.  Well, if you’re at all interested in understanding the dense, allusion-laden fiction of Joyce, or the psychology of Woolf, or the jurisprudential thinking of Louis Brandeis, or the inner lives of any of the rest of the “class of 1941”, having the right to freely access, publish, and build on their unpublished works can be crucial.

Up until now, for instance, scholars studying James Joyce have often been frustrated by sharp restrictions and legal threats made by the administrator of Joyce’s literary estate.  In 2008, Rebecca Ganz characterized the administrator thus: “[His] primary purpose is to quell any scholarship that he finds distasteful or an invasion of his family’s privacy. He has a history of harassing authors and artists until they buckle under the strain of trying to obtain legal rights to quote from the late author’s writings.”  Scholars wishing to invoke Joyce’s unpublished works in their work have either had to undertake multi-year legal battles, or cut back on the lines of inquiry they might otherwise pursue.

American libraries and archives have many illuminating papers by authors who died in 1941– even non-US authors like Joyce and Woolf.  US digitizers, librarians, and archivists can open up and publicize these works.   In some cases, we’re uniquely positioned to do so, since their unpublished works may still be under copyright in some other countries.

2. Increase worldwide availability of public domain works

Many of the millions of digitized books on the Internet are hosted in the US, in large-scale repositories like Google Books, HathiTrust, and the Internet Archive.  Many of these services give limited access to non-US readers or materials.  Google and HathiTrust, for instance, by default restrict non-US access to books published as long as 140 years ago, to avoid falling afoul of “life+70 years” copyright terms abroad.  JSTOR likewise restricts non-US access to journal volumes published in 1870 or later.

With another year’s worth of copyrights expiring in “life+70 years” countries, it should be safe for these US-based services to also open up worldwide access to another year’s worth of works, further freeing up the public domain.  HathiTrust is also willing to manually review copyrights on specific books to open up access.  If you come across any books in HathiTrust solely by authors who died in 1941 (or before) that are currently labeled only as “public domain in the United States”, you can request that they review them for opening up access worldwide.  Just use the “Feedback” button at the bottom of the book’s HathiTrust page, or the suggestion form on my Online Books Page; and make sure you ask specifically for non-US access.

3. Restore access to obscure copyrighted works from 1936 (and earlier)

After libraries and archives expressed concerns about the fate of obscure works under longer copyright terms, Congress included a special exemption in its 1998 copyright term extension.  The exemption, codified as section 108(h) of the copyright law, states that “during the last 20 years of any term of copyright of a published work, a library or archives, including a nonprofit educational institution that functions as such, may reproduce, distribute, display, or perform in facsimile or digital form a copy or phonorecord of such work, or portions thereof, for purposes of preservation, scholarship, or research”, under certain conditions.  In particular, if the institution finds, after a reasonable investigation, that such a work is not “subject to normal commercial exploitation” (such as by being in print) and cannot “be obtained at a reasonable price”, and no rightsholder has filed a claim otherwise, the work qualifies for this special exemption.  As of this year’s Public Domain Day, qualifying publications from 1936 join what is now 14 years of works in this category.
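The timing part of this exemption can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: it assumes a 95-year term counted to the end of the calendar year for published works from 1923 through 1977 (consistent with 1923 copyrights expiring at the start of 2019), and the function names are hypothetical.  It checks only the "last 20 years" window, not the other statutory conditions (the reasonable investigation, commercial availability, and rightsholder notices), which no date arithmetic can settle.

```python
ASSUMED_TERM_YEARS = 95  # assumption: term for works published 1923-1977
WINDOW_YEARS = 20        # "the last 20 years of any term", per section 108(h)


def last_year_of_term(pub_year: int) -> int:
    """Final calendar year of copyright under the assumed 95-year term."""
    return pub_year + ASSUMED_TERM_YEARS


def in_108h_window(pub_year: int, current_year: int) -> bool:
    """True if current_year falls in the last 20 years of the assumed term."""
    first_window_year = last_year_of_term(pub_year) - WINDOW_YEARS + 1
    return first_window_year <= current_year <= last_year_of_term(pub_year)


# Which publication years are inside the window during 2012?
eligible = [year for year in range(1923, 1978) if in_108h_window(year, 2012)]
print(eligible[0], eligible[-1], len(eligible))
```

Under these assumptions the window in 2012 covers publication years 1923 through 1936, which is the 14 years of works mentioned above.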

So far, I have found very little digitized content online where this exemption is explicitly invoked.  (There are advantages to explicitly doing so, both because it helps clarify the right to use the material, and helps prevent inadvertent unauthorized propagation of the works, such as the commercial reprints of digitized books that are now common on many large bookselling sites.)  Yet many of the works in HathiTrust’s (currently suspended) orphan works initiative, and in the Internet Archive’s lending library, and more besides, could well qualify for this treatment– and unlike orphan works, where legislation has yet to be passed, the exemption for these materials is already explicitly authorized by statute.

Providing online access for these works is not without controversy.  A 2002 article by lawyer Mary Minow details some of the possibilities and risks.   While she concludes that libraries can put such works on the Web, the recent Authors Guild complaint in its lawsuit against HathiTrust includes some push-back against this idea. But as the public domain in the US recedes further into history, and digital library projects increasingly look for ways to make our cultural heritage available online, American libraries would do well to proactively establish and exercise these rights for older works now languishing in obscurity.

4. Strengthen and sustain coalitions for reasonable copyright limits

The curtailment of the public domain is just one aspect of the overreach of copyright law in the US and elsewhere.  Right now, Congress is considering two bills, the Stop Online Piracy Act (SOPA) and the PROTECT IP Act (PIPA), whose enforcement provisions threaten to disrupt the core structures of the Internet and enable far-reaching censorship, in the name of stopping piracy.  Supporters of these bills hoped to have them passed by Christmas, but opposition from both “left” and “right” sides of the political spectrum has slowed the process down, caused some companies to withdraw support, and led to the proposal of less harmful alternatives for fighting piracy.

It’s still quite possible that SOPA and PIPA will pass, though.   Public Domain Day provides an opportunity for Americans to reflect on some of the good reasons for limiting the power and scope of copyright enforcement, and to redouble efforts to keep those limits reasonable.  Moreover, a coalition that can stop SOPA and PIPA can also work to prevent further extensions of copyright terms.  This can ensure that Americans will have more to celebrate in Public Domain Days to come– especially starting in 2019, when the remaining 1923 copyrights should finally expire in the US.

5. Give copyrights of your own to the public domain

Of course, those wishing to maximize public access and use of their works don’t have to wait for their copyrights to expire on their own.  They can dedicate them to the public domain any time they want.  Public Domain Day is a particularly auspicious time to make such gifts, no matter what country you’re in.  And with tools like the CC0 declaration, it’s easier than ever to do so.

A few years ago, I started an annual personal tradition of reviewing copyrights to works I’d created more than 14 years ago (the original initial term of copyright enacted by the founders of the US, and also approximately the ideal copyright term given in a recent economic analysis) and dedicating works to the public domain that I didn’t feel needed further copyright.  Accordingly, today I dedicate all the work of my creation that I published in 1997, and for which I still control rights, to the public domain.  For me, this consists primarily of websites like The Online Books Page as of that year, and other online writings.  But others have dedicated more high-profile material to the public domain after the same term.   And I’d be very happy to hear from others who are making similar dedications today (whether or not it’s after 14 years).

So, happy Public Domain Day to everyone in the US and elsewhere!  We all have things to celebrate, and things we can do, in the name of the public domain.

October 7, 2011

My mother’s orphan

Filed under: copyright,findingada,online books,open access,people,preservation,sharing,teaching — John Mark Ockerbloom @ 5:06 pm

Before my mother was pregnant with me, she was working on a book.

The book had begun its gestation at least a year before. She had been teaching math in Massachusetts, and was involved with the Madison Project, one of the initiatives that arose from the “new math” movement of the 1960s.  What excited her, and what I caught from her not long after I was born, was the sense of discovery and play that was encouraged in the Madison teaching style.  The primary focus wasn’t so much on imparting and drilling facts and rules, or on mundane applications, but on finding patterns, solving puzzles, and figuring out the secrets of numbers and geometry and the other mathematical constructs that underlie our world. Some project participants planned a series of books that would help bring out this sense of discovery and exploration in math classes.

Two small children in the house may have delayed my mother’s ambitions, but we didn’t stop her.  When I was in kindergarten, the piles of papers in my parents’ bedroom went away, and my mother proudly showed me her new book.  The book, Discoveries in Essential Mathematics, was co-written with Ramon Steinen, and published by Charles E. Merrill. Though the textbook was written for middle schoolers, I remember reading through the book after my mother showed it to me, solving the simpler problems, and smiling when I saw my name or my sister’s in an example.

She got small royalty checks for a few years, but the book was out of print by the late 1970s, never reaching a second edition.  We kept some copies in our basement, but I didn’t know of any library that held it.  When I visited the Library of Congress as a middle schooler, wrongly convinced that they had every book ever published, I remember my disappointment when I couldn’t find Mom’s book in their card catalog.

My mother eventually retired from teaching, and the enthusiasm and talent I’d gotten from Mom for math shifted into computing, and then into digital libraries.  And when my kids reached school age, I decided to try putting her book online.  In an era of large classes, detailed state standards, and high-stakes standardized tests, it might not be a viable standard textbook any more, but I think it’s still great for curious kids who show an interest in math.

Mom thought that was a great idea.  But she didn’t know if she could grant permission on her own.  Although long out of print, the book’s copyright had automatically renewed in 2000 under US copyright law, and she wasn’t sure if she had to get the consent of her publisher or co-author before she could give me the go-ahead. She didn’t know how to reach her co-author, and her old imprint was long gone.  Even its acquirer had itself been acquired by a large conglomerate some time ago.  So I let the idea drop, thinking I’d come back to it later when I had a little time to research the copyright.

But not long after, she started a long slide into dementia, and was soon in no position to give permission to anyone.  If her book had been practically an “orphan work” before, due to uncertainty over rights, it was even more so now.  There was no trouble locating the author; but no way of getting valid permission from someone definitely known to hold the rights.

Mom died this past winter, four years after my Dad had reluctantly moved her into the nursing home for good, and four weeks after he’d made his usual daily visit, gone back home, and had a fatal heart attack.  After we paid the last of the bills, and threw out the contents of the basement (where a burst pipe ruined all the books, papers, and other things they kept down there), what remained of what they had would now go to me and my siblings.

I still had a copy at home of the teacher’s edition of Mom’s book that she had once given to Grandma.  And between my mother’s funeral and the burst pipe, I’d taken a student edition out of their basement for my kids to read.  But any faint hope of finding publishing contracts or rights assignment documents was obliterated after the pipe burst.  The basic questions were: had Mom signed her rights to the book away, as many academic authors do? If so, had she gotten them back at some point?  Or had she never had the rights in the first place, as sometimes happens with textbook authors under “work for hire” contracts?

The copyright page of the book, and the record in the 1972 Catalog of Copyright Entries, show the publisher as the copyright claimant, so I couldn’t assume she had the rights.   But I also doubted whether I could get a clear answer, or reasonable licensing terms, from the company that had eventually acquired the assets of Mom’s original publisher.

I eventually found what I needed to know on a trip to Washington, DC.  While attending a meeting on digital format registries, I realized that I was in the same building as the Copyright Office.   So after the meeting, I got a reader’s card, went upstairs, and consulted the librarians there.  We confirmed that, under the automatic renewal laws of the time, the copyright to Mom’s book would have reverted in 2000 to whoever had been declared the “author” of the book in the original registration record.   Moreover, in the absence of any contrary arrangement, any co-owner of a copyright can authorize publication, as long as they split any proceeds with the other copyright owners.

Since I was planning just to put the book online for free, the only question remaining was: who was listed as the author on the original registration: the publisher who claimed the copyright, or my mother and Dr. Steinen?  It’s not clear from the Catalog of Copyright Entries, but the original registration certificate would state it.  And the one copy known to exist of that certificate was in the archives of the Copyright Office where I was sitting.

Twenty minutes later, I had the certificate in front of me.  The name on the “claimant” line was indeed the publisher’s, but the names on the “author” line were Steinen and Ockerbloom.  My mother’s orphan was mine to claim.

There are a lot more books out there like hers.  Since I added records for Hathi Trust’s public domain books to The Online Books Page, I’ve gotten requests to curate hundreds of out of print, largely forgotten books that are still meaningful to readers online.  Many of the people who opt to leave contact information live in places where books tend to be hard to get or pay for. Many others, judging from their names, seem to be related to the authors of the books they suggest. These readers have found the books after Hathi, or Google, or the Internet Archive, has resurfaced them online, and the readers want these books to live on.  If there were an easy, inexpensive, uncontroversially legal way to also bring back books that are still in copyright, but no longer commercially exploited, I’m sure I could fulfill a lot of requests for those books too.

For now, though, I’ll bring back the one orphan book I’ve been given. And I thank my mother for writing it, and the other women and men who have poured so much of their energy and teaching into their books, and the librarians of all kinds who help ensure those books stay accessible to readers who value them.  I’ll try my best to keep your legacies alive.

September 23, 2011

Early journals from JSTOR and others

Filed under: copyright,open access,serials,sharing — John Mark Ockerbloom @ 11:26 am

Earlier this month, JSTOR announced that it would provide free open access to its earliest scholarly journal content, published before 1923.  All of this material should be old enough to be in the public domain.  (Or at least it is in the US.  Since copyrights can last longer elsewhere, JSTOR is only showing pre-1870 volumes openly outside the US.)  I was very pleased to hear they would be opening up this content; it’s something I’d asked them to consider ever since they ended a small trial of open, public domain volumes in their early years.

Lots of early journal content now openly readable online

The time was ripe to open access at JSTOR.  (And not just because of growing discontent over limited access to public domain and publicly funded research.) Thanks to mass-digitization initiatives and other projects, much of the early journal content found in JSTOR is now also available from other sources.  For instance, after Gregory Maxwell posted a torrent of pre-1923 JSTOR volumes of the Philosophical Transactions of the Royal Society of London, I surveyed various free digital text sites and found nearly all the same volumes, and more, available for free from Hathi Trust, Google, the Internet Archive, Gallica, PubMed Central, and the Royal Society itself.  The content needed to be organized to be usefully browsable across sites, but that required a bit of basic librarianship and a bit of time.

Philosophical Transactions is not an anomaly.  After collating volumes of this journal, I looked at the first ten journals that signed on to JSTOR back in the mid-1990s.  (The list can be found below.)  I again found that nearly all of the pre-1923 content of these journals was also available from various free online sites.  Now, when you look them up on The Online Books Page, you’ll find links to both the JSTOR copies and the copies at other sites.

Comparing the sites that provide this content is enlightening.  In general, the JSTOR copies are better presented,  with article-level tables of contents, cross-volume searching, article downloads, and consistently high scan quality.  But the copies at other sites are generally usable as well, and sometimes include interesting non-editorial material, such as advertisements, that might not be present in JSTOR’s archive.  By opening up access to its early content now, though, JSTOR will remain the preferred access point to this early content for most researchers — and that, hopefully, will help attract and sustain paid support for the larger body of scholarly content that JSTOR provides and preserves for its subscribers.

And there’s a lot more in the public domain

JSTOR currently only provides open access for volumes up to 1922 (or up to 1869, if you’re not in the US).   But there’s lots more public domain journal content that can be made available.  Looking again at the initial ten JSTOR journals, I found that all of them have additional public domain content that is currently not available as open access on JSTOR, or as yet on other sites.  That’s because journals published in the US before 1964 had to renew their copyrights after 28 years or enter the public domain.  But most scholarly journals, including these 10, did not renew the copyrights to all their issues.  Here’s a list of the 10 journals, and their first issue copyright renewals:

  1. The American Historical Review – began 1895; issues first renewed in 1931
  2. Econometrica – began 1933; issues first renewed in 1942
  3. The American Economic Review – began 1911; issues not renewed before 1964 (when renewal became automatic)
  4. Journal of Political Economy – began 1892; issues first renewed in 1953
  5. Journal of Modern History – began 1929; issues first renewed in 1953
  6. The William and Mary Quarterly – began 1892; issues first renewed in 1946
  7. The Quarterly Journal of Economics – began 1886; issues first renewed in 1934
  8. The Mississippi Valley Historical Review (now the Journal of American History) – began 1914; issues first renewed in 1939
  9. Speculum – began 1926; issues first renewed in 1934
  10. Review of Economic Statistics (now the Review of Economics and Statistics) – began 1919; issues first renewed in 1935

This list reflects more proactive renewal policies than were typical for scholarly journals. A few years ago, I did a survey of JSTOR journals (summarized in this presentation) that were publishing between 1923 and 1950, and found that only 49 out of 298, or about 1/6, renewed any of their issue copyrights for that time period.  (JSTOR has since added more journals covering this time period, so the numbers will be different now, but I suspect the renewal rate won’t be any higher now than it was then.)

Currently JSTOR has no plans to open up access to post-1922 journal volumes.  But many of those volumes have been digitized, and are in Google’s or Hathi Trust’s collections; or they could be digitized by contributors to the Internet Archive or similar text archives.

If someone does want to open up these volumes, they should re-check their copyright status.   In particular, I have not yet checked the copyright status of individual articles in these journals, which can in theory be renewed separately.  In practice, I’ve found this rarely done for scholarly articles, but not completely unknown.  It might be feasible for me to do a “first article renewal” inventory for journals, like I’ve done for first issue renewal, which could speed up clearances.
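As a rough illustration of how the first-issue renewal data in the list above could be used for a first pass, here is a short Python sketch.  It rests on assumptions of mine: that each "first renewed" year is the publication year of the earliest renewed issue, that 1923 is the first year not already treated as public domain in the US, and that the year of the first renewal is excluded to be cautious.  The function name is hypothetical, and the result is only a list of candidate years to research, not a clearance: it says nothing about separately renewed articles.

```python
from typing import Optional, Tuple

FIRST_YEAR_OF_INTEREST = 1923   # earlier volumes are already treated as public domain in the US
AUTOMATIC_RENEWAL_YEAR = 1964   # renewal became automatic for works from this year on


def unrenewed_candidate_years(
    began: int, first_renewed: Optional[int]
) -> Optional[Tuple[int, int]]:
    """Inclusive range of publication years with no issue renewals on record.

    first_renewed is the publication year of the first renewed issue,
    or None if no issue before 1964 was renewed.
    """
    start = max(began, FIRST_YEAR_OF_INTEREST)
    cutoff = first_renewed if first_renewed is not None else AUTOMATIC_RENEWAL_YEAR
    end = cutoff - 1  # conservatively skip the year of the first renewal
    return (start, end) if start <= end else None


# Three entries taken from the list above: (year began, first renewed issue year).
journals = {
    "The American Historical Review": (1895, 1931),
    "Econometrica": (1933, 1942),
    "The American Economic Review": (1911, None),
}
for title, (began, first_renewed) in journals.items():
    print(title, unrenewed_candidate_years(began, first_renewed))
```

Each range is a starting point for the re-check described above; an article-level renewal search would still be needed before opening up any particular volume.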

Opportunities for open librarianship

JSTOR’s recent open access release of early journals, then, is just the beginning of the open access historic journal content that can be available online.  JSTOR provides a valuable service to libraries in providing and preserving comprehensive digital back runs of major scholarly journals, both public domain and copyrighted.  But while our libraries pay for that service, let’s also remember our mission to provide access to knowledge for all whenever possible.  JSTOR’s opening of its pre-1923 journal volumes is a much-appreciated contribution to a high-quality open record of early scholarship.  We can build on that further, with copyright research, digitization, and some basic public librarianship.  (I’ve discussed the basics of journal liberation in previous posts.)

For my part, I plan to start by gradually incorporating the open access JSTOR offerings into the serial listings of the Online Books Page, as time permits.  I can also gather further copyright information on these and other journals as I bring them in.  I’m also happy to hear about more journals that are online or can go online (whether they’re JSTOR journals or not); you can submit them via my suggestion interface.

How about you?  What would you like to see from the early scholarly record, and what can you do to help open it up?

April 9, 2011

Opt in for open access

Filed under: copyright,libraries,online books,open access — John Mark Ockerbloom @ 8:40 am

There’s been much discussion online about Judge Chin’s long-awaited decision to reject the settlement proposed by Google and authors and publishers’ organizations over the Google Books service. Settlement discussions continue (and the court has ordered a status conference for April 25).  But it’s clear that it will be a while before this case is fully settled or decided.

Don’t count on a settlement to produce a comprehensive library

When the suit is finally resolved, it will not enable the comprehensive retrospective digital library I had been hoping for.  That, Chin clearly indicated, was an over-reach.  The  proposed settlement would have allowed Google to sell access to most pre-2009 books published in the English-speaking world whose rightsholders had not opted out.   But, as Chin wrote, “the case was about the use of an indexing and searching tool, not the sale of complete copyrighted works.”  The changes in the American copyright regime that the proposed settlement entailed, he wrote, were too sweeping for a court to approve.

Unless Congress makes changes in copyright law, then, a rightsholder has to opt in for a copyrighted book to be made readable on Google (or on another book site).  Chin’s opinion ends with a strong recommendation for the parties to craft a settlement that would largely be based on “opt-in”.  Of course, an “opt in” requirement necessarily excludes orphan works, where one cannot find a rightsholder to opt in.  And as John Wilkin recently pointed out, it’s likely that a lot of the books held by research libraries are orphan works.

Don’t count on authors to step up spontaneously

Chin expects that many authors will naturally want to opt in to make their works widely available, perhaps even without payment.  “Academic authors, almost by definition, are committed to maximizing access to knowledge,” he writes.  Indeed, one of the reasons he gives for rejecting the settlement is the argument, advanced by Pamela Samuelson and some other objectors, that the interests of academic and other non-commercially motivated authors are different from those of the commercial organizations that largely drove the settlement negotiations.

I think that Chin is right that many authors, particularly academics, care more about having their work appreciated by readers than about making money off of it.  And even those who want to maximize their earnings on new releases may prefer freely sharing their out of print books to keeping them locked away, or making a pittance on paywall-mediated access.  But that doesn’t necessarily mean that we’ll see all, or even most, of these works “opted in” to a universally accessible library.  We’ve had plenty of experience with institutional repositories showing us that even when authors are fine in principle with making their work freely available, most will not go out of their way to put their work in open-access repositories, unless there are strong forces mandating or proactively encouraging it.

Don’t count on Congress to solve the problem

The closest analogue to a “mandate” for making older books generally available would be orphan works legislation.    If well crafted, such a law could make a lot of books available to the public that now have no claimants, revenue, or current audience, and I hope that a coalition can come together to get a good law passed. But an orphan works law could take years to adopt (indeed, it’s already been debated for years). There’s no guarantee on how useful or fair the law that eventually gets passed would be, after all the committees and interest groups are done with it.  And even the best law would not cover many books that could go into a universal digital library.

Libraries have what it takes, if they’re proactive

On the other hand, we have an unprecedented opportunity right now to proactively encourage authors (academic or otherwise) to make their works freely available online.  As Google and various other projects continue to scan books from library collections, we now have millions of these authors’ books deposited in “dark” digital archives.  All an interested author has to do is say the word, and the dark copy can be lit up for open access.  And libraries are uniquely positioned to find and encourage the authors in their communities to do this.

It’s now pretty easy to do, in many cases.  Hathi Trust, a coalition of a growing number of research institutions, currently has over 8 million volumes digitized from member libraries.  Most of the books are currently inaccessible due to copyright.  But they’ve published a permission agreement form that an author or other rightsholder can fill out and send in if they want to make their book freely readable online.  The form could be made a bit clearer and more visible, but it’s workable as it is.  As editor of The Online Books Page, I not infrequently hear from people who want to share their out of print books, or those of their ancestors, with the world.  Previously, I had to worry about how the books would get online.  Now I usually can just verify it’s in Hathi’s collection, and then refer them to the form.

Google Books also lets authors grant access rights through their partner program.  Joining the program is more complicated than sending in the Hathi form, and it’s more oriented towards selling books than sharing them.  But Google Books partners can declare their books freely readable in full if they wish, and can give them Creative Commons licenses (as they can with Hathi).  Google has even more digitized books in its archives than Hathi does.

So, all those who would love to see a wide-ranging (if not entirely comprehensive), globally accessible digital library now have a real opportunity to make it happen.  We don’t have to wait for Congress to act, or some new utopian digital library to arise.  Thanks to mass digitization, library coalitions like Hathi’s, and the development of simplified, streamlined rights and permissions processes, it’s easier than ever for interested authors (and heirs, and publishers) to make their work freely available online.  If those of us involved in libraries, scholarship, and the open access movement work to open up our own books, and those of our colleagues, we can light up access to the large, universal digital library that’s now waiting for us online.

November 11, 2010

You do the math

Filed under: open access,publishing,serials,sharing — John Mark Ockerbloom @ 6:02 pm

I recently heard from Peter Murray-Rust that the Central European Journal of Mathematics (CEJM) is looking for graduate students to edit the language of papers they publish.  CEJM is co-published by Versita and Springer Science+Business Media.

Would-be editors are promised their name on the masthead, and references and recommendations from the folks who run the journal.  These perks are tempting to a student (or postdoc) hoping for stable employment, but you can get such benefits working with just about any scholarly journal.  There’s no mention of actual pay for any of this editing work.  (Nor is there any pay for the associate editors they also seek, though those editors are also promised access to the journal’s content.)

The reader’s side of things looks rather different, when it comes to paying. If we look at Springer’s price lists for 2011, for instance, we see that the list price for a 1-year institutional subscription to CEJM is $1401 US for “print and free access or e-only”, or $1681 US for “enhanced access”.  An additional $42 is assessed for postage and handling, presumably waived if you only get the electronic version, but charged otherwise.

This is a high subscription rate even by the standards of commercial math journals.  At universities like mine, scholars don’t pay for the journal directly, but the money the library uses for the subscription is money that can’t be used to buy monographs, or to buy non-Springer journals, or to improve library service to our mathematics scholars.  Mind you, many universities get this journal as part of a larger package deal with Springer.  This typically lowers the price for each journal, but the package often includes a number of lower-interest journals that wouldn’t otherwise be bought.  Large amounts of money are tied up in these “big deals” with large for-profit publishers such as Springer.

If you can’t, or won’t, lay out the money for a subscription or larger package, you can pay for articles one at a time.  When I tried to look at a recent CEJM article from home, for instance, I was asked to pay $34 before I could read it.  Another option is author-paid open access.  CEJM authors who want to make their papers available through the journal without a paywall can do so through Springer’s Open Choice program.  This will cost the author $3000 US.

So there’s plenty of money involved in this journal.  It’s just that none of it goes to the editors they’re seeking.  Or to the authors of the papers, who submit them for free (or with a $3000 payment).  Or to the peer reviewers of the papers, if this journal works like most other scholarly journals and uses volunteer scholars as referees.  A scholar might justifiably wonder where all this money is going, or what value they get in return for it.

As the editor job ads imply, much of what scholars get out of editing and publishing in journals like these is recognition and prestige.  That, indeed, has value, but the cost-value function can be optimized much better than in this case.  CEJM’s website mentions that it’s tracked by major citation services, and has a 0.361 impact factor (a number often used, despite some notable problems, to give a general sense of a journal’s prestige).  Looking through the mathematics section of the Directory of Open Access Journals, I find a number of scholarly journals that are also tracked by citation services, but don’t charge anything to readers, and as far as I can tell don’t charge anything to authors either.   Here are some of them:

Central Europe, besides being the home of CEJM, is also the home of several open access math journals such as Documenta Mathematica (Germany), the Balkan Journal of Geometry and its Applications (Romania), and the Electronic Journal of Qualitative Theory of Differential Equations (Hungary).  For what it’s worth, all of these journals, and all the other open access journals mentioned in this post, currently show higher impact factors in Journal Citation Reports than CEJM does.

Free math journals aren’t limited to central Europe.  Here in the US, the American Mathematical Society makes the Bulletin of the American Mathematical Society free to read online, through the generosity of its members.  And on the campus where I work, Penn’s math department sponsors the Electronic Journal of Combinatorics.

A number of other universities also sponsor open-access journals, promoting their programs, and the findings of scholars worldwide, with low overhead.  For instance, there are two relatively high-impact math journals from Japanese universities: the Kyushu Journal of Mathematics and the Osaka Journal of Mathematics.  The latter journal’s online presence is provided by Project Euclid, a US-based initiative to support low-cost, non-profit mathematics publishing.

Ad-hoc groups of scholars can also organize their own open access journals in their favored specialty.  For instance, Homology, Homotopy and Applications is founded and entirely run by working mathematicians.  Some journals, such as the open access Discrete Mathematics and Theoretical Computer Science, use Open Journal Systems, a free open source publishing software package, to produce high-quality journal websites with little expenditure.

The Proceedings of the Indian Academy of Sciences: Mathematical Sciences is an interesting case.  Like many scholarly societies, the Indian Academy has recently made a deal with a for-profit publisher (Springer, as it turns out) to distribute their journals in print and electronic form.  Unlike many such societies, though, the Academy committed to continuing a free online version of this journal on their own website.

This is a fortunate decision for readers, because libraries that acquire the commercially published version will have to pay Springer $280 per year for basic access and $336 for “enhanced access”, according to their 2011 price list.  True, libraries get a print copy with this more expensive access (if they’re willing to pay Springer another $35 in postage and handling charges).  But the Academy sends out print editions within India for a total subscription price (postage included) of 320 rupees per year.   At today’s exchange rates, that’s less than $8 US.

Virtually all journals, whether in mathematics or other scholarly fields, depend heavily on unpaid academic labor for the authorship, refereeing, and in some cases editing of their content.  But, as you can see with CEJM and the no-fee open access journals mentioned above, journals vary widely in the amount of money they also extract from the academic community.  In between these two poles, there are also lots of other high-impact math journals with lower subscription prices, as well as commercial open access math journals with much lower author fees than Springer’s Open Choice.  These journals further diversify the channels of communication among mathematicians, without draining as much of  their funds.

I certainly hope mathematicians and other scholars will continue to volunteer their time and talents to the publication process, both for their benefit and for ours.  But if we optimize where and how we give our time and talent (and our institutional support), both scholars and the public will be better off.  As I’ve shown above, with a little bit of information and attention, there’s no shortage of low-cost, high-quality publishing venues that scholars can use as alternatives to overpriced journals.

October 18, 2010

October 15, 2010

Journal liberation: A community enterprise

Filed under: copyright,discovery,open access,publishing,serials,sharing — John Mark Ockerbloom @ 2:53 pm

The fourth annual Open Access Week begins on Monday.  If you follow the official OAW website, you’ll be seeing a lot of information about the benefits of free access to scholarly research.  The amount of open-access material grows every day, but much of the research published in scholarly journals through the years is still practically inaccessible to many, due to prohibitive cost or lack of an online copy.

That situation can change, though, sometimes more dramatically than one might expect.  A post I made back in June, “Journal liberation: A Primer”, discussed the various ways in which people can open access to journal content, past and present,  one article or scanned volume at a time.  But things can go much faster if you have a large group of interested liberators working towards a common goal.

Consider the New England Journal of Medicine (NEJM), for example.  It’s one of the most prominent journals in the world, valued both for its reports on groundbreaking new research, and for its documentation, in its back issues, of nearly 200 years of American medical history.  Many other journals with lesser value still cannot be read without paying for a subscription, or visiting a research library that has paid for a subscription.  But you can find and read most of NEJM’s content freely online, both past and present. Several groups of people made this possible.  Here are some of them.

The journal’s publisher has for a number of years provided open access to all research articles more than 6 months old, from 1993 onward.  (Articles less than 6 months old are also freely available to readers in certain developing countries, and in some cases for readers elsewhere as well.)  A registration requirement was dropped in 2007.

Funders of medical research, such as the National Institutes of Health, the Wellcome Trust, and the Howard Hughes Medical Institute, have encouraged publishers in the medical field to maintain or adopt such open access policies, by requiring their grantees (who publish many of the articles in journals like the NEJM) to make their articles openly accessible within months of publication.  Some of these funders also maintain their own repositories of scholarly articles that have appeared in NEJM and similar journals.

Google Books has digitized most of the back run of the NEJM and its predecessor publications as part of its Google Books database.  Many of these volumes are freely accessible to the public.  This is not the only digital archive of this material; there’s also one on NEJM’s own website, but access there requires either a subscription or a $15 payment per article.   Google’s scans, unlike the ones on the NEJM website, include the advertisements that appeared along with the articles.  These ads document important aspects of medical history that are not as easily seen in the articles, on subjects ranging from the evolving requirements and curricula of 19th-century medical schools to the early 20th-century marketing of heroin for patients as young as 3 years old.

It’s one thing to scan journal volumes, though; it’s another to make them easy to find and use– which is why NEJM’s for-pay archive got a fair bit of publicity when it was released this summer, while Google’s scans went largely unnoticed.  As I’ve noted before, it can be extremely difficult to find all of the volumes of a multi-volume work in Google Books; and it’s even more difficult in the case of NEJM, since issues prior to 1928 were published under different journal titles.  Fortunately, many of the libraries that supplied volumes for Google’s scanners have also organized links to the scanned volumes, making it easier to track down specific volumes.  The Harvard Libraries, for instance, have a chronologically ordered list of links to most of the volumes of the journal from 1828 to 1922, a period when it was known as the Boston Medical and Surgical Journal.

For many digitized journals, open access stops after 1922, because of uncertainty about copyright.  However, most scholarly journals have public domain content after that date, so it’s possible to go further if you research journal copyrights.  Thanks to records provided by the US Copyright Office and volunteers for The Online Books Page, we can determine that issues and articles of the NEJM prior to the 1950s did not have their copyrights renewed.  With this knowledge, Hathi Trust has been able and willing to open access to many volumes from the 1930s and 1940s.

We at The Online Books Page can then pull together these volumes and articles from various sources, and create a cover page that allows people to easily get to free versions of this journal and its predecessors all the way back to 1812.

Most of the content of the New England Journal of Medicine has thus been liberated by the combined efforts of several different organizations (and other interested people).  There’s still more that can be done, both in liberating more of the content, and in making the free content easier to find and use.  But I hope this shows how widespread journal liberation efforts of various sorts can free lots of scholarly research.  And I hope we’ll hear about many more free scholarly articles and journals being made available, or more accessible and usable, during Open Access Week and beyond.

I’ve also had another liberation project in the works for a while, related to books, but I’ll wait until Open Access Week itself to announce it.  Watch this blog for more open access-related news, after the weekend.

July 31, 2010

Keeping subjects up to date with open data

Filed under: data,discovery,online books,open access,sharing,subjects — John Mark Ockerbloom @ 11:51 pm

In an earlier post, I discussed how I was using the open data from the Library of Congress’ Authorities and Vocabularies service to enhance subject browsing on The Online Books Page.  More recently, I’ve used the same data to make my subjects more consistent and up to date.  In this post, I’ll describe why I need to do this, and why doing it isn’t as hard as I feared that it might be.

The Library of Congress Subject Headings (LCSH) is a standard set of subject names, descriptions, and relationships, begun in 1898, and periodically updated ever since. The names of its subjects have shifted over time, particularly in recent years.  For instance, recently subject terms mentioning “Cookery”, a word more common in the 1800s than now, were changed to use the word “Cooking”, a term that today’s library patrons are much more likely to use.

It’s good for local library catalogs that use LCSH to keep in sync with the most up to date version, not only to better match modern usage, but also to keep catalog records consistent with each other.  Especially as libraries share their online books and associated catalog records, it’s particularly important that books on the same subject use the same, up-to-date terms.  No one wants to have to search under lots of different headings, especially obsolete ones, when they’re looking for books on a particular topic.

Libraries with large, long-standing catalogs often have a hard time staying current, however.  The catalog of the university library where I work, for instance, still has some books on airplanes filed under “Aeroplanes”, a term that recalls the long-gone days when open-cockpit daredevils dominated the air.  With new items arriving every day to be cataloged, though, keeping millions of legacy records up to date can be seen as more trouble than it’s worth.

But your catalog doesn’t have to be big or old to fall out of sync.  It happens faster than you might think.   The Online Books Page currently has just over 40,000 records in its catalog, about 1% of the size of my university’s.   I only started adding LC subject headings in 2006.  I tried to make sure I was adding valid subject headings, and made changes when I heard about major term renamings (such as “Cookery” to “Cooking”).  Still, I was startled to find out that only 4 years after I’d started, hundreds of subject headings I’d assigned were already out of date, or otherwise replaced by other standardized headings.  Fortunately, I was able to find this out, and bring the records up to date, in a matter of hours, thanks to automated analysis of the open data from the Library of Congress.  Furthermore, as I updated my records manually, I became confident I could automate most of the updates, making the job faster still.

Here’s how I did it.  After downloading a fresh set of LC subject headings records in RDF, I ran a script over the data that compiled an index of authorized headings (the proper ones to use), alternate headings (the obsolete or otherwise discouraged headings), and lists of which authorized headings were used for which alternate headings. The RDF file currently contains about 390,000 authorized subject headings, and about 330,000 alternate headings.
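The indexing step above can be sketched roughly as follows. This is a minimal illustration, not the actual script: it assumes the RDF has already been parsed into simple (authorized heading, list of alternate headings) pairs, and the function name `build_heading_index` and the toy records are hypothetical.

```python
from collections import defaultdict


def build_heading_index(records):
    """Compile lookup tables from (authorized, alternates) pairs.

    `records` is an assumed, already-parsed form of the LCSH data:
    an iterable of (authorized_heading, list_of_alternate_headings).
    """
    authorized = set()
    alternates = defaultdict(set)  # alternate heading -> authorized headings
    for heading, alts in records:
        authorized.add(heading)
        for alt in alts:
            alternates[alt].add(heading)
    return authorized, dict(alternates)


# Toy records built from headings mentioned in this post (not real LC data).
sample_records = [
    ("Cooking", ["Cookery"]),
    ("Airplanes", ["Aeroplanes"]),
    ("Kings and rulers", ["Royalty"]),
    ("Queens", ["Royalty"]),
]
authorized, alternates = build_heading_index(sample_records)
print(sorted(alternates["Royalty"]))
```

Keeping a set of authorized headings for each alternate heading makes it easy to tell later whether an obsolete term has one possible replacement or several.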

Then I extracted all the subjects from my catalog.  (I currently have about 38,000 unique subjects.)  Then I had a script check each subject to see if it was listed as an authorized heading in the RDF file.  If not, I checked to see if it was an alternate heading.  If neither was the case, and the subject had subdivisions (e.g. “Airplanes — History”), I removed a subdivision from the end and repeated the checks until a term was found in either the authorized or alternate category, or I ran out of subdivisions.
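A rough sketch of that checking loop might look like the following. It is only an illustration under stated assumptions: subdivisions are taken to be separated by `" -- "` (a catalog may store them differently), the function name `classify_subject` is hypothetical, and the small lookup tables are toy data rather than the real LCSH index.

```python
SEPARATOR = " -- "  # assumed subdivision separator; adjust for your catalog


def classify_subject(subject, authorized, alternates):
    """Return (status, matched_term, dropped_subdivisions).

    status is "authorized", "alternate", or "unknown". Subdivisions are
    removed from the end, one at a time, until some prefix matches.
    """
    parts = subject.split(SEPARATOR)
    for end in range(len(parts), 0, -1):
        term = SEPARATOR.join(parts[:end])
        dropped = parts[end:]
        if term in authorized:
            return "authorized", term, dropped
        if term in alternates:
            return "alternate", term, dropped
    return "unknown", None, parts


# Toy lookup tables for illustration only.
authorized = {"Airplanes", "Cooking"}
alternates = {"Aeroplanes": {"Airplanes"}, "Cookery": {"Cooking"}}

print(classify_subject("Airplanes -- History", authorized, alternates))
print(classify_subject("Aeroplanes -- History", authorized, alternates))
```

Subjects that come back as "alternate" are the ones needing replacement. Subjects that come back as "unknown" (simple geographic or personal names, for example) would need separate handling.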

This turned up 286 unique subjects that needed replacement– over 3/4 of 1% of my headings, in less than 4 years.  (My script originally identified even more, until I realized I had to ignore the simple geographic or personal names.  Those aren’t yet in LC’s RDF file, but a few of them show up as alternate headings for other subjects.)  These 286 headings (some of them the same except for subdivisions) represented 225 distinct substitutions.  The bad headings were used in hundreds of bibliographic records, the most popular full heading being used 27 times. The vast majority of the full headings, though, were used in only one record.

What was I to replace these headings with?  Some of the headings had multiple possibilities. “Royalty” was an alternate heading for 5 different authorized headings: “Royal houses”, “Kings and rulers”, “Queens”, “Princes” and “Princesses”.   But that was the exception rather than the rule.  All but 10 of my bad headings were alternates for only one authorized heading.  After “Royalty”, the remaining 9 alternate headings presented a choice between two authorized forms.

When there’s only 1 authorized heading to go to, it’s pretty simple to have a script do the substitution automatically.  As I verified while doing the substitutions manually, nearly all the time the automatable substitution made sense.  (There were a few that didn’t: for instance, when “Mind and body — Early works to 1850” is replaced by “Mind and body — Early works to 1800”, works first published between 1800 and 1850 get misfiled.  But few substitutions were problematic like this– and those involving dates, like this one, can be flagged by a clever script.)

If I were doing the update over again, I’d feel more comfortable letting a script automatically reassign, and not just identify, most of my obsolete headings.  I’d still want to manually inspect changes that affect more than one or two records, to make sure I wasn’t messing up lots of records in the same way; and I’d also want to manually handle cases where more than one term could be substituted.  The rest– the vast majority of the edits– could be done fully automatically.  The occasional erroneous reassignment of a single record would be more than made up by the repair of many more obsolete and erroneous old records.  (And if my script logs changes properly, I can roll back problematic ones later on if need be.)
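One way such a policy could be expressed in code is sketched below. Everything here is an assumption made for illustration: the function name `plan_replacement`, the threshold of two records, the four-digit-year test used to flag date-related headings, and the simple list used as a change log.

```python
import re

YEAR = re.compile(r"\b\d{4}\b")  # crude test for a date in a heading


def plan_replacement(subject, old_term, targets, record_count):
    """Decide how to handle one obsolete subject heading.

    `old_term` is the obsolete prefix of `subject` that matched an
    alternate heading; `targets` is the set of authorized headings it
    maps to. Returns (action, new_subject), where action is "auto" or
    "review".
    """
    if len(targets) != 1:
        return "review", None         # several candidates: a person chooses
    (new_term,) = targets
    new_subject = new_term + subject[len(old_term):]
    if YEAR.search(old_term) or YEAR.search(new_term):
        return "review", new_subject  # date ranges may not line up
    if record_count > 2:
        return "review", new_subject  # affects many records: inspect first
    return "auto", new_subject


change_log = []  # (old, new) pairs, kept so changes can be rolled back
old = "Cookery -- History"
action, new = plan_replacement(old, "Cookery", {"Cooking"}, record_count=1)
if action == "auto":
    change_log.append((old, new))
print(action, new)
```

Headings sent to "review" still carry a suggested replacement when there is exactly one candidate, so the manual pass only has to confirm or reject it.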

Mind you, now that I’ve brought my headings up to date once, I expect that further updates will be quicker anyway.  The Library of Congress releases new LCSH RDF files about every 1-2 months.  There should be many fewer changes in most such incremental updates than there would be when doing years’ worth of updates all at once.

Looking at the evolution of the Library of Congress catalog over time, I suspect that they do a lot of this sort of automatic updating already.  But many other libraries don’t, or don’t do it thoroughly or systematically.  With frequent downloads of updated LCSH data, and good automated procedures, I suspect that many more could.  I have plans to analyze some significantly larger, older, and more diverse collections of records to find out whether my suspicions are justified, and hope to report on my results in a future post.  For now, I’d like to thank the Library of Congress once again for publishing the open data that makes these sorts of catalog investigations and improvements feasible.

June 11, 2010

Journal liberation: A primer

Filed under: copyright,libraries,open access,publishing,sharing — John Mark Ockerbloom @ 10:07 am

As Dorothea Salo recently noted, the problem of limited access to high-priced scholarly journals may be reaching a crisis point.  Researchers that are not at a university, or are at a not-so-wealthy one, have long been frustrated by journals that are too expensive for them to read (except via slow and cumbersome inter-library loan, or distant library visits).  Now, major universities are feeling the pain as well, as bad economic news has forced budget cuts in many research libraries, even as further price increases are expected for scholarly journals.  This has forced many libraries to consider dropping even the most prestigious journals, when their prices have risen too high to afford.

Recently, for instance, the University of California, which has been subject to significant budget cuts and furloughs, sent out a letter in protest of Nature Publishing Group’s proposal to raise their subscription fees by 400%.  The letter raised the possibility of cancelling all university subscriptions to NPG, and having scholars boycott the publisher.

Given that Nature is one of the most prestigious academic journals now publishing, one that has both groundbreaking current articles and a rich history of older articles, these are strong words.  But dropping subscriptions to journals like Nature might not be as much of a hardship for readers as it once might have been.  Increasingly, it’s possible to liberate the research content of academic journals, both new and old, for the world.  And, as I’ll explain below, now may be an especially opportune time to do that.

Liberating new content

While some of the content of journals like Nature is produced by the journal’s editorial staff or other writers for hire, the research papers are typically written by outside researchers, employed by universities and other research institutions.  These researchers hold the original copyright to their articles, and even if they sign an agreement with a journal to hand over rights to them (as they commonly do), they retain whatever rights they don’t sign over.  For many journals, including the ones published by Nature Publishing Group, researchers retain the right to post the submitted version of their paper (known as a “preprint”) in local repositories.  (According to the Romeo database, they can also eventually post the “postprint”– the final draft resulting after peer review, but before actual publication in the journal– under certain conditions.)  These drafts aren’t necessarily identical to the version of record published in the journal itself, but they usually contain the same essential information.

So if you, as a reader, find a reference to a Nature paper that you can’t access, you can search to see if the authors have placed a free copy in an open access repository. If they haven’t, you can contact one of them to encourage them to do so.  To find out more about providing open access to research papers, see this guide.

If a journal’s normal policies don’t allow authors to share their work freely in an open access repository, authors may still be able to retain their rights with a contract addendum or negotiation.  When that hasn’t worked, some academics have decided to publish in, or review for, other journals, as the California letter suggests.  (When pushed too far, some professors have even resigned en masse from editorial boards to start new journals that are friendlier to authors and readers.)

If nothing else, scholarly and copyright conventions generally respect the right of authors to send individual copies of their papers to colleagues that request them.  Some repository software includes features that make such copies extremely easy to request and send out.  So even if you can’t find a free copy of a paper online already, you can often get one if you ask an author for it.

Liberating historic content

Many journals, including Nature, are important not only for their current papers, but for the historic record of past research contained in their back issues.  Those issues may be difficult to get a hold of, especially as many libraries drop print subscriptions, deaccession old journal volumes, or place them in remote storage.  And electronic access to old content, when it’s available at all, can be surprisingly expensive.  For instance, if I want to read this 3-paragraph letter to the editor from 1872 on Nature’s web site, and I’m not signed in at a subscribing institution, the publisher asks me to pay them $32 to read it in full.

Fortunately, sufficiently old journals are in the public domain, and digitization projects are increasingly making them available for free.  At this point, nearly all volumes of Nature published before 1922 can now be read freely online, thanks to scans made available to the public by the University of Wisconsin, Google, and Hathi Trust.  I can therefore read the letters from that 1872 issue, on this page, without having to pay $32.

Mass digitization projects typically stop providing public access to content published after 1922, because copyright renewals after that year might still be in force.  However, most scholarly journals– including, as it turns out, Nature — did not file copyright renewals.  Because of this, Nature issues are actually in the public domain in the US all the way through 1963 (after which copyright renewal became automatic).  By researching copyrights for journals, we can potentially liberate lots of scholarly content that would otherwise be inaccessible to many. You can read more about journal non-renewal in this presentation, and research copyright renewals via this site.

Those knowledgeable about copyright renewal requirements may worry that the renewal requirement doesn’t apply to Nature, since it originates in the UK, and renewal requirements currently only apply to material that was published in the US before, or around the same time as, it was published abroad.  However, offering to distribute copies in the US counts as US publication for the purposes of copyright law.  Nature did just that when they offered foreign subscriptions to journal issues and sent them to the US; and as one can see from the stamp of receipt on this page, American universities were receiving copies within 30 days of the issue date, which is soon enough to retain the US renewal requirement.  Using similar evidence, one can establish US renewal requirements for many other journals originating in other countries.

Minding the gap

This still leaves a potential gap between the end of the public domain period and the present.  That gap is only going to grow wider over time, as copyright extensions continue to freeze the growth of the public domain in the US.

But the gap is not yet insurmountable, particularly for journals that are public domain into the 1960s.  If a paper published in 1964 included an author who was a graduate student or a young researcher, that author may well be still alive (and maybe even be still working) today, 46 years later.  It’s not too late to try to track authors down (or their immediate heirs), and encourage and help them to liberate their old work.

Moreover, even if those authors signed away all their rights to journal publishers long ago, or don’t remember if they still have any rights over their own work, they (or their heirs) may have an opportunity to reclaim their rights.  For some journal contributions between 1964 and 1977, copyright may have reverted to authors (or their heirs) at the time of copyright renewal, 28 years after initial publication.  In other cases, authors or heirs can reclaim rights assigned to others, using a termination of transfer.  Once authors regain their rights over their articles, they are free to do whatever they like with them, including making them freely available.

The rules for reversion of author’s rights are rather arcane, and I won’t attempt to explain them all here.  Terminations of transfer, though, involve various time windows when authors have the chance to give notice of termination, and reclaim their rights.  Some of the relevant windows are open right now.   In particular, if I’ve done the math correctly, 2010 marks the first year one can give notice to terminate the transfer of a paper copyrighted in 1964, the earliest year in which most journal papers are still under US copyright.  (The actual termination of a 1964 copyright’s transfer won’t take effect for another 10 years, though.)  There’s another window open now for copyright transfers from 1978 to 1985; some of those terminations can take effect as early as 2013.  In the future, additional years will become available for author recovery of copyrights assigned to someone else.  To find out more about taking back rights you, or researchers you know, may have signed away decades ago, see this tool from Creative Commons.
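The window arithmetic above can be checked with a short sketch. The constants are assumptions drawn from a general reading of the US termination provisions, not from this post: terminations for pre-1978 copyrights are counted from 56 years after the copyright date, terminations for transfers made in 1978 or later from 35 years after the transfer, each can take effect during a 5-year span, and notice must be given between 2 and 10 years ahead. The function names are hypothetical, and whole-year arithmetic ignores exact dates and the many special cases in the law.

```python
def notice_window(start_year, years_until_termination):
    """Return (first, last) years in which notice could be given.

    Termination may take effect during a 5-year span that begins
    `years_until_termination` years after `start_year`; notice must
    come at least 2 and at most 10 years before the effective date.
    """
    first_effective = start_year + years_until_termination
    last_effective = first_effective + 5
    return first_effective - 10, last_effective - 2


def pre_1978_copyright(year):
    # Assumed rule: counted from 56 years after the copyright date.
    return notice_window(year, 56)


def post_1977_transfer(year):
    # Assumed rule: counted from 35 years after the transfer.
    return notice_window(year, 35)


print(pre_1978_copyright(1964))   # (2010, 2023)
print(post_1977_transfer(1978))   # (2003, 2016)

# Transfer years from 1978 onward whose notice window includes 2010.
open_in_2010 = [
    y for y in range(1978, 2011)
    if post_1977_transfer(y)[0] <= 2010 <= post_1977_transfer(y)[1]
]
print(open_in_2010[0], open_in_2010[-1])  # 1978 1985
```

Under these assumptions, 2010 is the first notice year for a 1964 copyright, with termination taking effect no earlier than 2020, and the transfers with an open notice window in 2010 run from 1978 through 1985, with the earliest effective date in 2013.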

Recognizing opportunity

To sum up, we have opportunities now to liberate scholarly research over the full course of scholarly history, if we act quickly and decisively.  New research can be made freely available through open access repositories and journals.  Older research can be made freely available by establishing its public domain status, and making digitizations freely available.  And much of the research in the not-so-distant past, still subject to copyright, can be made freely available by looking back through publication lists, tracking down researchers and rights information, and where appropriate reclaiming rights previously assigned to journals.

Journal publishing plays an important role in the certification, dissemination, and preservation of scholarly information.  The research content of journals, however, is ultimately the product of scholars themselves, for the benefit of scholars and other knowledge seekers everywhere.   However the current dispute is ultimately resolved between Nature Publishing Group and the University of California, we would do well to remember the opportunities we have to liberate journal content for all.
