Everybody's Libraries

January 1, 2014

Public Domain Day 2014: The fight for the public domain is on now

Filed under: copyright,data,open access,sharing — John Mark Ockerbloom @ 2:42 pm

New Year’s Day is upon us again, and with it, the return of Public Domain Day, which I’m happy to see has become a regular celebration in many places over the last few years.  (I’ve observed it here since 2008.)  In Europe, the Open Knowledge Foundation gives us a “class picture” of authors who died in 1943, and whose works are now entering the public domain there and in other “life+70 years” countries.  Meanwhile, countries that still hold to the Berne Convention’s “life+50 years” copyright term, including Canada, Japan, New Zealand, and many others, get the works of authors who died in 1963.  (The Open Knowledge Foundation also has highlights for those countries, where Narnia/Brave-New-World/purloined-plums crossover fanfic is now completely legal.)  And Duke’s Center for the Study of the Public Domain laments that, for the 16th straight year, the US gets no more published works entering the public domain, and highlights the works that would have gone into the public domain here were it not for later copyright extensions.
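The arithmetic behind these dates is simple enough to sketch.  Under Berne-style laws, a "life+N years" term runs to the end of the calendar year, so an author's works enter the public domain on January 1 of the year after the term ends.  The function name below is only an illustration, and the sketch ignores national exceptions such as wartime extensions and special rules for anonymous or posthumous works.

```python
def public_domain_year(death_year, term_years):
    """Return the year on whose January 1 an author's works enter the
    public domain under a simple "life + term_years" rule.

    Assumption: the term runs through December 31 of
    death_year + term_years, with no national exceptions applied.
    """
    return death_year + term_years + 1


# Authors who died in 1943, in "life+70 years" countries
print(public_domain_year(1943, 70))

# Authors who died in 1963, in "life+50 years" countries
print(public_domain_year(1963, 50))
```

Both calls give 2014, which is why the two groups of countries are celebrating authors who died twenty years apart on the same day.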

It all starts to look a bit familiar after a few years, and while we may lament the delays in works entering the public domain, it may seem like there’s not much to do about it right now.  After all, most of the world is getting another year’s worth of public domain again on schedule, and many commentators on the US’s frozen public domain don’t see much changing until we approach 2019, when remaining copyrights on works published in 1923 are scheduled to finally expire.  By then, writers like Timothy Lee speculate, public domain supporters will be ready to fight the passage of another copyright term extension bill in Congress like the one that froze the public domain here back in 1998.

We can’t afford that sense of complacency.  In fact, the fight to further extend copyright is raging now, and the most significant campaigns aren’t happening in Congress or other now-closely-watched legislative chambers.  Instead, they’re happening in the more secretive world of international trade negotiations, where major intellectual property hoarders have better access than the general public, and where treaties can be used to later force extensions of the length and impact of copyright laws at the national level, in the name of “harmonization”.   Here’s what we currently have to deal with:

Remaining Berne holdouts are being pushed to add 20 more years of copyright.  Remember how I said that Canada, Japan, and New Zealand were all enjoying another year of “life+50 years” copyright expirations?  Quite possibly not for long.  All of those countries are also involved in the Trans-Pacific Partnership (TPP) negotiations, which include a strong push for more extensive copyright control.  The exact terms are being kept secret, but a leaked draft of the intellectual property chapter from August 2013 shows agreement by many of the countries’ trade negotiators to mandate “life+70 years” terms across the partnership.  That would mean a loss of 20 years of public domain for many TPP countries, and ultimately increased pressure on other countries to match the longer terms of major trade partners.  Public pressure from citizens of those countries can prevent this from happening– indeed, a leak from December hints that some countries that had favored extensions back in August are reconsidering.  So now is an excellent time to do as Gutenberg Canada suggests and let legislators and trade representatives know that you value the public domain and oppose further extensions of copyright.

Life+70 years countries still get further copyright extensions.   The push to extend copyrights further doesn’t end when a country abandons the “life+50 years” standard.  Indeed, just this past year the European Union saw another 20 years added on to the terms of sound recordings (which previously had a 50-year term of their own in addition to the underlying life+70 years copyrights on the material being recorded.)  This extension is actually less than the 95 years that US lobbyists had pushed for, and are still pushing for in the Trans-Pacific Partnership, to match terms in the US.

(Why does the US have a 95-year term in the first place that it wants Europe to harmonize with?  Because of the 20-year copyright extension that was enacted in 1998 in the name of harmonizing with Europe.  As with climbers going from handhold to handhold and foothold to foothold higher in a cliff, you can always find a way to “harmonize” copyright ever upward if you’re determined to do so.)

The next major plateau for international copyright terms, life+100 years, is now in sight.  The leaked TPP draft from August also includes a proposal from Mexico to add yet another 30 years onto copyright terms, to life+100 years, which that country adopted not many years ago.  It doesn’t have much chance of passage in the TPP negotiations, where to my knowledge only Mexico has favored the measure.   But it makes “life+70” seem reasonable in comparison, and sets a precedent for future, smaller-scale trade deals that could eventually establish longer terms.  It’s worth remembering, for instance, that Europe’s “life+70” terms started out in only a couple of countries, spread to the rest of Europe in European Union trade deals, and then to the US and much of the rest of the world.  Likewise, Mexico’s “life+100” proposal might be more influential in smaller-scale Latin American trade deals, and once established there, spread to the US and other countries.  With 5 years to go before US copyrights are scheduled to expire again in significant numbers, there’s time for copyright maximalists to get momentum going for more international “harmonization”.

What’s in the public domain now isn’t guaranteed to stay there.  That’s been the case for a while in Europe, where the public domain is only now getting back to where it was 20 years ago.  (The European Union’s 1990s extension directive rolled back the public domain in many European countries, so in places like the United Kingdom, where the new terms went into effect in 1996, the public domain is only now getting to where it was in 1994.)  But now in the US as well, where “what enters the public domain stays in the public domain” has been a long-standing custom, the Supreme Court has ruled that Congress can in fact remove works from the public domain in certain circumstances.   The circumstances at issue in the case they ruled on?  An international trade agreement– which as we’ve seen above is now the prevailing way of getting copyrights extended in the first place.   Even an agreement that just establishes life+70 years as a universal requirement, but doesn’t include the usual grandfathered exception for older works, could put the public domain status of works going back as far as the 1870s into question, as we’ve seen with HathiTrust international copyright determinations.

But we can help turn the tide.  It’s also possible to cooperate internationally to improve access to creative works, and not just lock it up further.  We saw that start to happen this past year, for instance, with the signing of the Marrakesh Treaty on copyright exceptions and limitations, intended to ensure that those with disabilities that make it hard to read books normally can access the wealth of literature and learning available to the rest of the world.  The treaty still needs to be ratified before it can go into effect, so we need to make sure ratification goes through in our various countries.  It’s a hopeful first step in international cooperation increasing access instead of raising barriers to access.

Another improvement now being discussed is to require rightsholders to register ongoing interest in a work if they want to keep it under copyright past a certain point.  That idea, which reintroduces the concept of “formalities”, has been floated by some prominent figures like US Copyright Register Maria Pallante.  Such formalities would alleviate the problem of “orphan works” no longer being exploited by their owners but not available for free use.   (And a sensible, uniform formalities system could be simpler and more straightforward than the old country-by-country formalities that Berne got rid of, or the formalities people already accept for property like motor vehicles and real estate.)  Pallante’s initial proposal represents a fairly small step; for compatibility with the Berne Convention, formalities would not be required until the last 20 years of a full copyright term.  But with enough public support, it could help move copyright away from a “one size fits all” approach to one that more sensibly balances the interests of various kinds of creators and readers.

We can also make our own work more freely available.  For the last several years, I’ve been applying my own personal “formalities” program, in which I release into the public domain works I’ve created that I don’t need to further limit.  So in keeping with the original 14-year renewable terms of US copyright law, I now declare that all work that I published in 1999, and that I have sole control of rights over, is hereby dedicated to the public domain via a CC0 grant.  (They join other works from the 1900s that I’ve also dedicated to the public domain in previous years.)  For 1999, this mostly consists of material I put online, including all versions of  Catholic Resources on the Net, one of the first websites of its kind, which I edited from 1993 to 1999.  It also includes another year’s history of The Online Books Page.

Not that you have to wait 14 years to free your work.  Earlier this year, I released much of the catalog data from the Online Books Page into the public domain.  The metadata in that site’s “curated collection” continues to be released as open data under a CC0 grant as soon as it is published, so other library catalogs, aggregators, and other sites can freely reuse, analyze, and republish it as they see fit.

We can do more with work that’s under copyright, or that seems to be.  Sometimes we let worries about copyright keep us from taking full advantage of what copyright law actually allows us to do with works.  In the past couple of years, we saw court rulings supporting the rights of Google and HathiTrust to use digitized, but not publicly readable, copies of in-copyright books for indexing, search, and preservation purposes.   (Both cases are currently being appealed by the Authors Guild.)  HathiTrust has also researched hundreds of thousands of book copyrights, and as of a month ago they’d enabled access to nearly 200,000 volumes that were classified as in-copyright under simple precautionary guidelines, but determined to be actually in the public domain after closer examination.

In the coming year, I’d like to see if we can do similar work to open up access to historical journals and other serials as well.  For instance, Duke’s survey of the lost public domain mentions that 1957 articles from major science journals like Nature, Science, and JAMA are behind paywalls, but as far as I’ve been able to tell, none of those three journals renewed copyrights for their 1957 issues.  Scientists are also increasingly making current work openly available through open access journals, open access repositories, and even discipline-wide initiatives like SCOAP3, which also debuts today.

There are also some potentially useful copyright exemptions for libraries in Section 108 of US copyright law that we could use to provide more access to brittle materials, materials nearing the end of their copyright term, and materials used by print-impaired users.

Supporters of the public domain who sit around and wait for the next copyright extension to get introduced into their legislatures are like generals expecting victory by fighting the last war.  There’s a lot that public domain supporters can do, and need to do, now.  That includes countering the ongoing extension of copyright through international trade agreements, promoting initiatives to restore a proper balance of interest between rightsholders and readers, improving access to copyrighted work where allowed, making work available that’s new to the public domain (or that we haven’t yet figured out is out of copyright), and looking for opportunities to share our own work more widely with the world.

So enjoy the New Year and the Public Domain Day holiday.  And then let’s get to work.

March 22, 2013

Updates on library linking, Wikipedia, and what you can do

Filed under: discovery,libraries,sharing — John Mark Ockerbloom @ 4:55 pm

I’m gratified by the positive response I’ve been getting to the Forward To Libraries service I first introduced last month.  It really took off when I announced the templates for linking to libraries from Wikipedia a couple of weeks ago.   They’ve been written up in places like Boing Boing and in Wikipedia’s own Signpost newsletter.   The service now includes more than 150 libraries throughout the English-speaking world.  Wikipedia editors are also adding the link templates to various articles–  besides the handful I added myself, more than 450 have been added by other editors at this writing.  And I’ve heard from numerous librarians who now want to start editing Wikipedia themselves, both to add library links and to otherwise improve articles.  (Here’s how to become a Wikipedia editor.)

So far, I’ve largely provided this service on my own, with support from the University of Pennsylvania Libraries.   But I’d like to make the service more useful, and could use some help.  If you’re interested, here are some things you might want to know:

Some libraries are easier to link than others.   If you’re using one of many standard library catalogs or discovery systems, and you haven’t made substantial modifications to it, it’s easy for me to add your system. I basically just record what software you’re using and where on the Web the service runs, run some test searches to verify your system, and you’re good to go.  If you’re using a more customized, obscure, or home-grown system, I might still be able to add links to it, but it may take me more effort to figure out how to make useful search links into the system.  Any information you can provide would be helpful.  There are also certain off-the-shelf systems that I have problems with.  Many Polaris systems, for example, will give a “session timed out” message the first time you try to follow a search link into the system.   (Back up and try the link again, and everything will be fine for some time afterwards.)  Some other systems don’t seem to support deep search links in any consistent way that I’ve been able to determine, and not just some very old session-based systems, but also EBSCO’s fairly new EDS discovery platform.

I’ve determined ways to link into these various systems from reading various documentation files I’ve found on the public Internet, along with some reverse-engineering of public web sites.  If you know of better ways to link to some of these systems that I haven’t yet figured out myself, and this information can be made public, let me know.
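To make the idea of "recording what software you're using and where on the Web the service runs" concrete, here is a minimal sketch of how a forwarding service might build a subject search link from that information.  It is not the actual FTL code.  The software names, URL patterns, library identifier, and host below are all hypothetical placeholders, not the link syntax of any real catalog product.

```python
from urllib.parse import quote_plus

# Hypothetical URL patterns, one per kind of catalog software.
# Real systems each have their own link syntax; these are placeholders.
SEARCH_PATTERNS = {
    "catalog-a": "{base}/search?type=subject&q={query}",
    "catalog-b": "{base}/find?index=su&term={query}",
}

# Hypothetical library records: which software runs where.
LIBRARIES = {
    "example-public": {
        "software": "catalog-a",
        "base": "https://catalog.example.org",
    },
}


def subject_search_url(library_id, heading):
    """Build a deep link that searches one library for a subject heading."""
    library = LIBRARIES[library_id]
    pattern = SEARCH_PATTERNS[library["software"]]
    # quote_plus escapes spaces and punctuation for use in a query string
    return pattern.format(base=library["base"], query=quote_plus(heading))


print(subject_search_url("example-public", "Copyright -- United States"))
```

With this layout, adding a library that runs unmodified, already-known software only means adding one record.  A customized or home-grown system would need its own pattern, which is where the extra effort described above comes in.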

For now, I’m declining to list libraries that don’t have many English-language subject or Library of Congress name headings, because the results of English searches in those libraries will be misleadingly incomplete.  But I’m considering ways to include translated searches, where the data to support this is available, for a wider range of countries.  (VIAF already provides much relevant data for names.)

The most popular new Wikipedia Library resource template is also controversial, and might be modified or deleted.   I provide a number of different templates for linking from Wikipedia to libraries, including the inlined text templates “Library resources about” and “Library resources by”, and the all-in-one sidebar template “Library resources box”. By far the most used of these templates has been the Library resources box.   It’s easy to spot in an article, it organizes links clearly, and it’s easy for editors to recognize as a template that they can add to articles they find of interest.  But some Wikipedians, including at least one Wikipedia admin, have objected to the template.  They cite style guidelines that say external link templates should not use boxes or other graphical elements, but only appear as inlined text.  I’ve defended the boxes, noted how other library-related external links commonly appear in boxes, and proposed ways to address various Wikipedian concerns.   But it’s ultimately up to the Wikipedia community to determine whether or how library links will appear in Wikipedia articles.  To find out more about the issues, see the Library resources box talk page.  And if you’re a Wikipedia editor or user, feel free to weigh in on that page or other relevant forums.

I’m exploring ways to make it easier for readers to get to our libraries.  For one, I’m starting to record IP ranges for some institutions, so that local network users can follow “resources in your library” links straight to the institution’s library, without having to first register a preference.  (Users can still register a different preference if they want.)  IP-based routing is an experimental service, initially being provided to a limited number of institutions, and I may modify or withdraw it in the future.  If you’d like me to consider it for your institution, you can submit a request, with the relevant IP ranges (preferably in CIDR format) in the “anything we should know?” field.  Note that the IP ranges you submit will be published as part of the library data I’m sharing for this project.
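For readers curious how IP-based routing of this kind can work, here is a minimal sketch using Python's standard ipaddress module.  It is an illustration under stated assumptions rather than the service's actual implementation.  The library identifiers and function names are hypothetical, and the CIDR ranges are reserved documentation addresses, not real institutional ranges.

```python
import ipaddress

# Hypothetical data: CIDR ranges submitted for each institution.
LIBRARY_RANGES = {
    "example-university": ["192.0.2.0/24", "2001:db8::/32"],
    "example-college": ["198.51.100.0/25"],
}


def library_for_ip(address, ranges=LIBRARY_RANGES):
    """Return the library whose CIDR ranges contain the address, or None."""
    ip = ipaddress.ip_address(address)
    for library, cidrs in ranges.items():
        for cidr in cidrs:
            network = ipaddress.ip_network(cidr)
            # Only compare addresses and networks of the same IP version
            if ip.version == network.version and ip in network:
                return library
    return None


def choose_library(address, registered_preference=None):
    """A registered preference, if any, takes priority over IP routing."""
    if registered_preference is not None:
        return registered_preference
    return library_for_ip(address)


print(choose_library("192.0.2.17"))
print(choose_library("192.0.2.17", registered_preference="example-college"))
print(choose_library("203.0.113.5"))
```

The first call routes by address, the second honors the reader's registered preference, and the third finds no match, in which case a service would fall back to asking the reader to choose a library.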

I’m starting to share my work on Github.  There is now a Github repository with selected data and code for the FTL project.  In it, you’ll find the data I use to link to the libraries enrolled in the service, and you’ll also see the code for the main CGI script used to forward readers to those libraries.   You can’t yet run the service out of the box yourself with the code and data provided so far, but I hope that what’s there will help people understand how the service works, and possibly implement similar services themselves if they’re so inclined.  The data’s released under CC0, so you can reuse it however you like; and the code is open-source licensed under the Educational Community License 2.0.  I hope to add more data and code over time, and I’m happy to hear suggestions for enhancements and improvements.

I’m hoping that as more people get involved, the service will improve, library resources will become more reachable online, and Wikipedia will become a more useful resource as well.  If you’d like to get involved yourself, I’d love to hear what you’re up to, and what suggestions you might have.

January 1, 2013

Public Domain Day 2013: or, There and Back Again

Filed under: copyright,online books,open access,sharing — John Mark Ockerbloom @ 7:48 am

The first day of the new year is Public Domain Day, when many countries celebrate a year’s worth of copyrights expiring, and the associated works become freely available for anyone to share and adapt.  As the Public Domain Day page at Duke’s Center for the Study of the Public Domain notes, the United States once again does not have much to celebrate.  Except for unpublished works by authors who died in 1942, no copyrights expire in the US today.  Under current law, Americans still have to wait 6 more years before any more copyrights of published works will expire.  (Subsisting copyrights from 1923 are scheduled to finally enter the public domain at the start of 2019.)

The start of 2013 is more significant in Europe, where the Open Knowledge Foundation has a more upbeat Public Domain Day site featuring authors who died in 1942, and whose published works enter the public domain today in most of the European Union. But that isn’t actually breaking new ground in most of Europe, because 2013 is also the 20th anniversary of the 1993 European Union Copyright Duration Directive, which required European countries to retroactively extend their copyright terms from the Berne Convention’s “life of the author plus 50 years” to “life of the author plus 70 years”, and put 20 years’ worth of public domain works back into copyright in those countries.

For countries that used the Berne Convention’s term and implemented the directive right away, today marks the day that the public domain finally returns to its maximum extent of 20 years ago.  Only next year will Europe start seeing truly new public domain works.  (And since many European countries took a couple of years or more to implement the directive– the UK implemented it at the start of 1996, for instance– it may still be a few years yet before their public domain is back again to what it once was.)

At least the last US copyright extension, in 1998, only froze the public domain, without rolling it back.  If the US had not passed that extension, we would be seeing works published in 1937, such as the first edition of J.R.R. Tolkien’s The Hobbit, now entering the public domain.  (If the US hadn’t made any post-publication extensions, we’d also have the more familiar revision of The Hobbit, in which Gollum does not voluntarily give Bilbo the Ring, in the public domain now as well, along with all three volumes of The Lord of the Rings.)   Folks in Canada and other “life+50 years” countries, now celebrating the public domain status of works by authors who died in 1962, may be able to freely share and adapt Tolkien’s works in another 11 years.  Folks in Europe and the US who’d like to see a variety of visual adaptations, though, will have to content themselves with the estate-licensed Peter Jackson and Rankin/Bass adaptations for a while to come.

But there are still things Americans can do to make today meaningful.  For the last few years, I’ve been releasing copyrights I control into the public domain after 14 years (the original term of copyright set by the country’s founders, with an option to renew for another 14).  So today, I dedicate all such copyrights for works I published in 1998 to the public domain.  This includes my computer science doctoral dissertation, Mediating Among Diverse Data Formats.  If I believed a recent fearmongering statement from certain British journal editors, I should be worried about plagiarism resulting from this dedication, which doesn’t even have the legal attribution requirement of the CC-BY license they decry.  But as I’ve explained in a previous post on plagiarism, plagiarism is fundamentally an ethical rather than a legal matter, and scholars can no more get away with plagiarizing public domain material than they can with copyrighted material.   Both are and should be a career-killer in academia.

I’ll also continue to feature “new” public domain works from around the world on The Online Books Page.  Starting today, for instance, I’ll be listing works featured in The Public Domain Review, a wonderful ongoing showcase of public domain works inaugurated by the Open Knowledge Foundation on Public Domain Day 2011.  I’ll also be continuing to add listings from Project Gutenberg Canada and other sites in “life+50 years” countries, as well as other titles suggested by my readers.

Finally, I’ll be keeping a close eye on Congress’s actions on copyright.  In this past year, the Supreme Court ruled that Congress could take works out of the public domain, meaning that the public domain in the US is now under threat of shrinking, and not just freezing.  And the power of the copyright lobby was evident this year when a Republican Study Committee memo recommending copyright reform (including shorter terms) was yanked within 24 hours of its posting, and its author then fired.  On the other hand, 2012 also saw one of the largest online protests in history stop a copyright lobby-backed Internet censorship bill in its tracks.  If the public shows that it cares as much about the public domain as about bills like SOPA, we could have a growing public domain back again before long, instead of works going back again into copyright.

July 22, 2012

Building on a full complement of copyright records

Filed under: citizen librarians,copyright,open access,sharing — John Mark Ockerbloom @ 12:22 pm

Thanks to recent efforts of the US Copyright Office, we now have a complete digitization of summary copyright registration and renewal records back to the late 19th century.  As Mike Burke and others at the Copyright Office have been reporting on their blog, Copyright Matters: Digitization and Public Access, the Copyright Office has now digitized nearly every volume of the Catalog of Copyright Entries, and its predecessor publication, the Catalogue of Title Entries of Books and Other Articles, to the start of that serial in 1891.  Combined with the current online Copyright Catalog database, and some independent scans that fill in gaps in the Copyright Office set, records for every copyright registration and renewal still in force in the US can now be found online, free of charge.

This is a great benefit for people wanting to make better use of copyrighted works and the public domain.  With the information now online, we can quickly verify copyright and public domain status for lots of works, and also get useful leads on current owners of copyrights, in ways that were not possible when the only copies of the Catalog were in closed reserve at certain federal depository libraries.  Various people in the Copyright Office  have been hoping for a while to get approval and funding for this digitization, and I’m very thankful for their persistence in seeing the work through.

Not all the work is done, though.  Although the Catalog is now online, its records are not as easy to search, navigate through, and interpret as they could be.  There’s no one-stop search box, for instance, that will reliably bring you to any copyright record with your query terms, regardless of date or type of record.  And the Copyright Office also has more information about its copyright registrations– some of it on catalog cards, and more of it on original registration certificates like the one I found when researching the status of my mother’s book– that could be useful to people researching copyright status and looking for rightsholders.

For now, the Copyright Office is scanning the cards used to look up volumes of registration certificates, and that are also the basis of the Catalog of Copyright Entries printed volumes.  From my (limited) experience with these cards, they don’t seem to add much information to what’s in the printed Catalog, but it’s easier to automatically create a searchable, structured database of copyright records from the cards, with their fairly regular typefaces and formats, than it would be to create one from the Catalog scans.  According to their latest blog post, the Copyright Office is now creating digital images of the relevant cards, and hopes to be done by the end of Fiscal Year 2014, or a little over 26 months from now.  They’re also hoping to work with various partners– including “crowdsourcing” partnerships– to reliably convert the information on the cards into machine-readable form.

There are also lots of ways to make the existing online records more useful.   On my own copyright records site, for instance, I’ve now made a comprehensive index to all the Catalog volumes, and created a table to make it easier to look up records in digitized Catalog volumes, based on the year and type of copyright registration.  I’m still working on further refinements, and would be very happy to hear suggestions.  (I’ve also been unable to find one 12-month stretch of records for copyrights from 1895 and 1896.  Fortunately, all the copyrights from those years have long since expired, but I’d still be grateful to anyone who can help me fill this last gap.)

At the same time, I’ve been using the comprehensive record set to help me research and publicize copyright status for listings on The Online Books Page.  For instance, if I’m listing public domain issues of a journal, magazine, or other serial, I’ll also look to see whether additional issues might also be in the public domain if their copyrights were not renewed.  Then I’ll place a note about this on my cover page for the serial, if applicable.

As for the Copyright Office, I’m hoping that they can soon start digitizing their volumes of registration certificates, which contain a lot of useful additional information about copyrights and copyright holders, and which no one else has.  Digitizing all of them wouldn’t be cheap– there are a lot of pages potentially to digitize, usually two for each registration.  But perhaps they could start digitizing incrementally, either on a prioritized systematic basis (e.g., starting with the most recent volumes), or on a demand-based basis (e.g., digitizing when someone wants to obtain a copy of one of a volume’s certificates).

These are only a few of the things that could be done with the records now online, by people anywhere with the suitable motivation.  I’d love to hear what others are doing or thinking of doing.

May 28, 2012

Finding the (market) value in freeing books

Filed under: online books,open access,sharing — John Mark Ockerbloom @ 8:45 pm

I list a number of books on The Online Books Page that are relatively recent copyrighted books  put online with the permission of the copyright holder.  I am very thankful to the authors and other rightsholders who have agreed to share their works with the world.

But although I’ve been avidly collecting such listings, along with a much larger number of public domain listings, for nearly 20 years, the vast majority of copyrighted books have not been made freely readable online by their rightsholders. One of the more significant reasons is that many writers (particularly those who depend on writing income for a living) hope to earn money from selling their work rather than giving it away.  Indeed, as the market for selling ebooks has grown, I’ve occasionally needed to delist titles where the author or publisher withdrew their free copies in favor of selling Kindle or Nook editions.  Such sales can provide some income, but at the cost of a reduced audience. Meanwhile, many other books remain both offline and out of print, not available at all except in a limited number of used and library copies.  This situation isn’t good either for many readers (who may have a hard time finding or reading the book) or for authors (who, at least in the US, make no money from used book sales or library loans).

Eric Hellman and his colleagues at Gluejar think they can improve matters for both readers and writers:  Find a way to pay authors or other rightsholders to make their books freely readable (and possibly adaptable) online.  In particular, use the Internet to pull together a crowd of supporters for books no longer readily available, who collectively pay for the rights to make the work openly accessible.  If they succeed, the author gets some more money, and the world gets the gift of a book to freely read and share.  In such a crowd-funded market for freeing books, everybody can end up better off than they were before.

Projects like Kickstarter, in which online funders collectively supported more than 10,000 new creative projects last year, show that such crowd-funding can be valuable, viable and scalable.  But every kind of market is different, and the dot-com boom and bust showed that certain kinds of online marketplaces were wildly successful, and others, not so much.  You don’t really know what a new kind of market is going to look like until you have a go at creating it.

So I was pleased and intrigued when Gluejar launched Unglue.it on May 17, with campaigns to collectively buy rights to make five out-of-print books freely available to the world.  The opening campaigns are still in progress, but after 11 days and a fair bit of publicity, we can start to get an idea of how this particular market is developing.

The first thing I notice is that the initial funding goals in many cases erred on the side of optimism.  The goal prices for rights to the first batch of books, all of which appear to be previously published but now out of print, range from $7,500 to $50,000, but only one of the books has managed to raise more than $500 in its first 11 days.  Unless pledges pick up substantially (or the goals are lowered, something that the rules appear to allow), most of these books aren’t going to get funded.

However, one of the initial Unglue.it books, Oral Literature in Africa, stands a decent chance at meeting its goal.  Currently 34% of the way there with 24 days to go, it’s raised more money in pledges as I write this than all the other titles combined.  You could attribute some of the difference to better publicity (being featured in Boing Boing certainly doesn’t hurt). But all of the titles have had a fair chance to get promoted, and other titles, like Riverwatch, have gotten special attention in other blogs.  I think it’s more likely that the nature of the book, and the terms of the offer, make a big difference in its ability to attract sponsors. Three particular things about this book stand out for me:

  1. The book addresses an ongoing interest in a way that is not readily substitutable.  While oral literature in Africa is not the most popular subject imaginable, a sizable number of scholars and ordinary readers around the world take an interest in African culture and heritage.  For such interested readers, the book occupies a unique niche.  It is, according to its campaign’s description, the book that “single-handedly created the field of ethnography of language”.  Such landmark books can remain valuable reference points even after their fields have advanced well beyond where they were when the book first came out.  Furthermore, the book collects numerous examples of oral literature (and the unglued version will include more such material, in audio as well as textual form); these specific works can be important texts for study that cannot be readily overlooked.  Someone who’s interested generally in horror novels or beginning readers can choose from many different books to fill that need besides the two books that are campaigning in these areas.  But someone who wants to understand African oral literature, and its analysis by English-speaking scholars, would have a hard time passing up this book.
  2. The book is offered with a license that maximally encourages reuse and enhancement.  Three of the other initial Unglue.it books are being offered with a Creative Commons Attribution-NonCommercial-NoDerivatives license.  This lets people freely read the books, and pass them around as long as they don’t charge for them, but that’s about it; no adaptations, updates, sequels, fan-fiction, or the like.  One other book, Cat and Rat, offers the Attribution-NonCommercial-ShareAlike license, which does allow adaptation, but puts some restrictions on how versions and adaptations of the book can be distributed or reused.  Oral Literature in Africa, though, is offered with a license that’s more liberal than any of the other books.  Its Creative Commons Attribution license allows for virtually any kind of adaptation, enhancement, or distribution, as long as proper credit is given to the original source and author.  The ability to adapt and enhance a work is particularly important for nonfiction.  As I noted last year when discussing the Digital Public Library of America, the useful lifespan of nonfiction works can go way up when they can be updated and adapted to new knowledge and needs.  A license permitting that, and the promised release of additional material under similar terms, make the freeing of Oral Literature in Africa especially desirable.
  3. The book is offered with a realistic goal price. Despite being arguably the most attractive book in the first set based on the criteria above, Oral Literature in Africa was offered with a lower price goal than any of the other books.  It’s not a trivial goal– as of this writing, nearly 2/3 of the funds still need to be raised– but it seems to be well-matched with its demand.  According to Amazon, used copies of this book are being sold starting at just over $20.  At that price level, it will take 375 readers– a reasonable number for an academic title– willing to pay an average of that amount to free the book.  In contrast, the going prices for the other books are quite a bit lower.  Three of the four other books are offered starting at $0.01 plus shipping; the fourth does not appear in Amazon’s used market, but is offered in a Kindle edition for $9.99.   (Even at that price point, it would require more than 5,000 “buyers” to meet its current goal.)  Personally, I’m more likely to support a book liberation campaign if I know my pledge is likely to make a difference, and the campaign is likely to succeed.  Right now, that looks distinctly possible for Oral Literature in Africa, but not for the others.
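The arithmetic behind the third point can be checked with a minimal Python sketch.  The function name `backers_needed` is hypothetical, and the sketch assumes (as the figures above imply, though the post doesn’t state it outright) that the $7,500 low end of the goal range belongs to Oral Literature in Africa and the $50,000 high end belongs to the title sold as a $9.99 Kindle edition.  The “just over $20” used price is rounded to $20.

```python
import math


def backers_needed(goal_dollars: float, average_pledge: float) -> int:
    """Smallest whole number of backers whose average pledge covers the goal."""
    return math.ceil(goal_dollars / average_pledge)


# Assumed pairing of goals and prices, inferred from the figures in the post.
oral_literature = backers_needed(7_500, 20.00)   # pledging about the used-copy price
kindle_title = backers_needed(50_000, 9.99)      # pledging the Kindle price

print(oral_literature)  # 375
print(kindle_title)     # 5006
```

The comparison is only a rough gauge of demand: actual pledges vary widely in size, so the number of backers a campaign needs could be well above or below these figures.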

Mind you, there’s no guarantee that any of these initial books will meet its goal.  Oral Literature in Africa will make it if pledges don’t drop from the level they came in for the first 10 days, but will miss the mark if pledges drop off significantly.  (I pledged $25 to support this book two days ago; since then it’s received less than $100 in additional pledges.  However, this was over a holiday weekend; things may pick up again during the week.) If the campaign, or one or more of the others, does succeed, we’ll have our first baseline for this new book-liberation market.

But one book doesn’t make a market.  The true test of the “ungluing” idea will be to see how many other books come after it.  Will there be enough to make ungluing a significant new source of freely readable and adaptable copyrighted books?   Will there be enough commission revenue from enough campaigns to support enterprises like Gluejar as businesses?  That remains to be seen.  I haven’t yet seen any other books join the initial 5 that Gluejar has offered, but I wouldn’t be at all surprised if other rightsholders are closely watching these first campaigns, planning to decide based on how they turn out.

From what I’ve seen so far, if a rightsholder has a particularly distinctive, not easily substitutable book, offers licenses or other premiums attractive enough to interest a sizable support base, and is realistic about revenue expectations, they could well enjoy new revenue from books they’ve already written– and let new generations of readers enjoy and build on their work at the same time.  We’ll see how many of them find that a worthwhile bargain.

October 7, 2011

My mother’s orphan

Filed under: copyright,findingada,online books,open access,people,preservation,sharing,teaching — John Mark Ockerbloom @ 5:06 pm

Before my mother was pregnant with me, she was working on a book.

The book had begun its gestation at least a year before. She had been teaching math in Massachusetts, and was involved with the Madison Project, one of the initiatives that arose from the “new math” movement of the 1960s.  What excited her, and what I caught from her not long after I was born, was the sense of discovery and play that was encouraged in the Madison teaching style.  The primary focus wasn’t so much on imparting and drilling facts and rules, or on mundane applications, but on finding patterns, solving puzzles, and figuring out the secrets of numbers and geometry and the other mathematical constructs that underlie our world. Some project participants planned a series of books that would help bring out this sense of discovery and exploration in math classes.

Two small children in the house may have delayed my mother’s ambitions, but we didn’t stop her.  When I was in kindergarten, the piles of papers in my parents’ bedroom went away, and my mother proudly showed me her new book.  The book, Discoveries in Essential Mathematics, was co-written with Ramon Steinen, and published by Charles E. Merrill. Though the textbook was written for middle schoolers, I remember reading through the book after my mother showed it to me, solving the simpler problems, and smiling when I saw my name or my sister’s in an example.

She got small royalty checks for a few years, but the book was out of print by the late 1970s, never reaching a second edition.  We kept some copies in our basement, but I didn’t know of any library that held it.  When I visited the Library of Congress as a middle schooler, wrongly convinced that they had every book ever published, I remember my disappointment when I couldn’t find Mom’s book in their card catalog.

My mother eventually retired from teaching, and the enthusiasm and talent I’d gotten from Mom for math shifted into computing, and then into digital libraries.  And when my kids reached school age, I decided to try putting her book online.  In an era of large classes, detailed state standards, and high-stakes standardized tests, it might not be a viable standard textbook any more, but I think it’s still great for curious kids who show an interest in math.

Mom thought that was a great idea.  But she didn’t know if she could grant permission on her own.  Although long out of print, the book’s copyright had automatically renewed in 2000 under US copyright law, and she wasn’t sure if she had to get the consent of her publisher or co-author before she could give me the go-ahead. She didn’t know how to reach her co-author, and her old imprint was long gone.  Even its acquirer had itself been acquired by a large conglomerate some time ago.  So I let the idea drop, thinking I’d come back to it later when I had a little time to research the copyright.

But not long after, she started a long slide into dementia, and was soon in no position to give permission to anyone.  If her book had been practically an “orphan work” before, due to uncertainty over rights, it was even more so now.  There was no trouble locating the author; but no way of getting valid permission from someone definitely known to hold the rights.

Mom died this past winter, four years after my Dad had reluctantly moved her into the nursing home for good, and four weeks after he’d made his usual daily visit, gone back home, and had a fatal heart attack.  After we paid the last of the bills, and threw out the contents of the basement (where a burst pipe ruined all the books, papers, and other things they kept down there), what remained of what they had would now go to me and my siblings.

I still had a copy at home of the teacher’s edition of Mom’s book that she had once given to Grandma.  And between my mother’s funeral and the burst pipe, I’d taken a student edition out of their basement for my kids to read.  But any faint hope of finding publishing contracts or rights assignment documents was obliterated after the pipe burst.  The basic questions were: had Mom signed her rights to the book away, as many academic authors do? If so, had she gotten them back at some point?  Or had she never had the rights in the first place, as sometimes happens with textbook authors under “work for hire” contracts?

The copyright page of the book, and the record in the 1972 Catalog of Copyright Entries, show the publisher as the copyright claimant, so I couldn’t assume she had the rights.   But I also doubted whether I could get a clear answer, or reasonable licensing terms, from the company that had eventually acquired the assets of Mom’s original publisher.

I eventually found what I needed to know on a trip to Washington, DC.  While attending a meeting on digital format registries, I realized that I was in the same building as the Copyright Office.  So after the meeting, I got a reader’s card, went upstairs, and consulted the librarians there.  We confirmed that, under the automatic renewal laws of the time, the copyright to Mom’s book would have reverted in 2000 to whoever had been declared the “author” of the book in the original registration record.  Moreover, in the absence of any contrary arrangement, any co-owner of a copyright can authorize publication, as long as they split any proceeds with the other copyright owners.

Since I was planning just to put the book online for free, the only question remaining was: who was listed as the author on the original registration: the publisher who claimed the copyright, or my mother and Dr. Steinen?  It’s not clear from the Catalog of Copyright Entries, but the original registration certificate would state it.  And the one copy known to exist of that certificate was in the archives of the Copyright Office where I was sitting.

Twenty minutes later, I had the certificate in front of me.  The name on the “claimant” line was indeed the publisher’s, but the names on the “author” line were Steinen and Ockerbloom.  My mother’s orphan was mine to claim.

There are a lot more books out there like hers.  Since I added records for Hathi Trust’s public domain books to The Online Books Page, I’ve gotten requests to curate hundreds of out of print, largely forgotten books that are still meaningful to readers online.  Many of the people who opt to leave contact information live in places where books tend to be hard to get or pay for. Many others, judging from their names, seem to be related to the authors of the books they suggest. These readers have found the books after Hathi, or Google, or the Internet Archive, has resurfaced them online, and the readers want these books to live on.  If there were an easy, inexpensive, uncontroversially legal way to also bring back books that are still in copyright, but no longer commercially exploited, I’m sure I could fulfill a lot of requests for those books too.

For now, though, I’ll bring back the one orphan book I’ve been given. And I thank my mother for writing it, and the other women and men who have poured so much of their energy and teaching into their books, and the librarians of all kinds who help ensure those books stay accessible to readers who value them.  I’ll try my best to keep your legacies alive.

September 23, 2011

Early journals from JSTOR and others

Filed under: copyright,open access,serials,sharing — John Mark Ockerbloom @ 11:26 am

Earlier this month, JSTOR announced that it would provide free open access to their earliest scholarly journal content, published before 1923.  All of this material should be old enough to be in the public domain.  (Or at least it is in the US.  Since copyrights can last longer elsewhere, JSTOR is only showing pre-1870 volumes openly outside the US.)  I was very pleased to hear they would be opening up this content; it’s something I’d asked them to consider ever since they ended a small trial of open, public domain volumes in their early years.

Lots of early journal content now openly readable online

The time was ripe to open access at JSTOR.  (And not just because of growing discontent over limited access to public domain and publicly funded research.) Thanks to mass-digitization initiatives and other projects, much of the early journal content found in JSTOR is now also available from other sources.  For instance, after Gregory Maxwell posted a torrent of pre-1923 JSTOR volumes of the Philosophical Transactions of the Royal Society of London, I surveyed various free digital text sites and found nearly all the same volumes, and more, available for free from Hathi Trust, Google, the Internet Archive, Gallica, PubMed Central, and the Royal Society itself.  The content needed to be organized to be usefully browsable across sites, but that required a bit of basic librarianship and a bit of time.

Philosophical Transactions is not an anomaly.  After collating volumes of this journal, I looked at the first ten journals that signed on to JSTOR back in the mid-1990s.  (The list can be found below.)  I again found that nearly all of the pre-1923 content of these journals was also available from various free online sites.  Now, when you look them up on The Online Books Page, you’ll find links to both the JSTOR copies and the copies at other sites.

Comparing the sites that provide this content is enlightening.  In general, the JSTOR copies are better presented,  with article-level tables of contents, cross-volume searching, article downloads, and consistently high scan quality.  But the copies at other sites are generally usable as well, and sometimes include interesting non-editorial material, such as advertisements, that might not be present in JSTOR’s archive.  By opening up access to its early content now, though, JSTOR will remain the preferred access point to this early content for most researchers — and that, hopefully, will help attract and sustain paid support for the larger body of scholarly content that JSTOR provides and preserves for its subscribers.

And there’s a lot more in the public domain

JSTOR currently only provides open access for volumes up to 1922 (or up to 1869, if you’re not in the US).   But there’s lots more public domain journal content that can be made available.  Looking again at the initial ten JSTOR journals, I found that all of them have additional public domain content that is currently not available as open access on JSTOR, or as of yet on other sites.  That’s because journals published in the US before 1964 had to renew their copyrights after 28 years or enter the public domain.  But most scholarly journals, including these 10, did not renew the copyrights to all their issues.  Here’s a list of the 10 journals, and their first issue copyright renewals:

  1. The American Historical Review – began 1895; issues first renewed in 1931
  2. Econometrica – began 1933; issues first renewed in 1942
  3. The American Economic Review – began 1911; issues not renewed before 1964 (when renewal became automatic)
  4. Journal of Political Economy – began 1892; issues first renewed in 1953
  5. Journal of Modern History – began 1929; issues first renewed in 1953
  6. The William and Mary Quarterly – began 1892; issues first renewed in 1946
  7. The Quarterly Journal of Economics – began 1886; issues first renewed in 1934
  8. The Mississippi Valley Historical Review (now the Journal of American History) – began 1914; issues first renewed in 1939
  9. Speculum – began 1926; issues first renewed in 1934
  10. Review of Economic Statistics (now the Review of Economics and Statistics) – began 1919; issues first renewed in 1935

This list reflects more proactive renewal policies than were typical for scholarly journals. A few years ago, I did a survey of JSTOR journals (summarized in this presentation) that were publishing between 1923 and 1950, and found that only 49 out of 298, or about 1/6, renewed any of their issue copyrights for that time period.  (JSTOR has since added more journals covering this time period, so the numbers will be different now, but I suspect the renewal rate won’t be any higher now than it was then.)
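The renewal rule described above can be sketched as a first-pass triage for US-published journal issues.  This is only an illustration under stated assumptions, not legal advice or a complete test: the names `Status` and `us_issue_status` are hypothetical, the 1923 cutoff is the one that applied when this post was written, and the sketch looks only at issue-level renewals.

```python
from enum import Enum


class Status(Enum):
    PUBLIC_DOMAIN = "public domain in the US"
    LIKELY_PUBLIC_DOMAIN = "issue not renewed; check article renewals"
    NEEDS_RESEARCH = "may still be under copyright; needs research"


def us_issue_status(pub_year: int, issue_renewed: bool) -> Status:
    """First-pass triage for a journal issue published in the US.

    Assumes the rules as summarized in the post: anything published
    before 1923 is public domain, and issues published before 1964
    had to be renewed after 28 years or enter the public domain.
    """
    if pub_year < 1923:
        return Status.PUBLIC_DOMAIN
    if pub_year < 1964 and not issue_renewed:
        # The issue lapsed, but individual articles could in theory
        # have been renewed separately, so this is not a final answer.
        return Status.LIKELY_PUBLIC_DOMAIN
    return Status.NEEDS_RESEARCH


# Example: a 1925 issue of a journal whose first renewed issue is from 1931.
print(us_issue_status(1925, issue_renewed=False).value)
```

A result of `LIKELY_PUBLIC_DOMAIN` is a prompt for further checking rather than a clearance, since it says nothing about contributions that were registered and renewed on their own.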

Currently JSTOR has no plans to open up access to post-1922 journal volumes.  But many of those volumes have been digitized, and are in Google’s or Hathi Trust’s collections; or they could be digitized by contributors to the Internet Archive or similar text archives.

If someone does want to open up these volumes, they should re-check their copyright status.   In particular, I have not yet checked the copyright status of individual articles in these journals, which can in theory be renewed separately.  In practice, I’ve found this rarely done for scholarly articles, but not completely unknown.  It might be feasible for me to do a “first article renewal” inventory for journals, like I’ve done for first issue renewal, which could speed up clearances.

Opportunities for open librarianship

JSTOR’s recent open access release of early journals, then, is just the beginning of the open access historic journal content that can be available online.  JSTOR provides a valuable service to libraries in providing and preserving comprehensive digital back runs of major scholarly journals, both public domain and copyrighted.  But while our libraries pay for that service, let’s also remember our mission to provide access to knowledge for all whenever possible.  JSTOR’s opening of its pre-1923 journal volumes is a much-appreciated contribution to a high-quality open record of early scholarship.  We can build on that further, with copyright research, digitization, and some basic public librarianship.  (I’ve discussed the basics of journal liberation in previous posts.)

For my part, I plan to start by gradually incorporating the open access JSTOR offerings into the serial listings of the Online Books Page, as time permits.  I can also gather further copyright information on these and other journals as I bring them in.  I’m also happy to hear about more journals that are online or could go online (whether they’re JSTOR journals or not); you can submit them via my suggestion interface.

How about you?  What would you like to see from the early scholarly record, and what can you do to help open it up?

June 15, 2011

A digital public library we still need, and could build now

Filed under: citizen librarians,copyright,libraries,people,sharing — John Mark Ockerbloom @ 12:39 pm

It’s been more than half a year since the Digital Public Library of America project was formally launched, and I’m still trying to figure out what the project organizers really want it to be.  The idea of “a digital library in service of the American public” is a good one, and many existing digital libraries already play that role in a variety of ways.  As I said when I christened this blog, I’m all for creating a multitude of libraries to serve a diversity of audiences and information needs.

At a certain point after an enthusiastic band of performers says “Let’s put on a show!”, though, someone has to decide what their show’s going to be about, and start focusing effort there.  So far, the DPLA seems to be taking an opportunistic approach.  Instead of promulgating a particular blueprint for what they’ll do, they’re asking the community for suggestions, in a “beta sprint” that ends today.   Whether this results in a clear distinctive direction for the project, or a mishmash of ideas from other digitization, aggregation, preservation, and public service initiatives, remains to be seen.

Just about every digital project I’ve seen is opportunistic to some extent.   In particular, most of the big ones are opportunistic when it comes to collection development.  We go after the books, documents, and other knowledge resources that are close to hand in our physical collections, or that we find people putting on the open web, or that our users suggest, or volunteer to provide on their own.

There are a number of good reasons for this sort of opportunism.  It lets us reuse work that we don’t have to redo ourselves.  It can inform us of audience interests and needs (at least as far as the interests of the producers we find align with the interests of the consumers we serve).  And it’s cheap, and that’s nothing to sneer at when budgets are tight.

But the public libraries that my family prefers to use don’t, on the whole, have opportunistically built collections.  Rather, they have collections shaped primarily by the needs of their patrons, and not primarily by the types of materials they can easily acquire.   The “opportunistic” community and school library collections I’ve seen tend to be the underfunded ones, where books in which we have yet to land on the Moon, the Soviet Union is still around, or Alaska is not yet a state may be more visible than books that reflect current knowledge or world events.  The better libraries may still have older titles in their research stacks, but they lead with books that have current relevance to their community, and they go out of their way to acquire reliable, readable resources for whatever information needs their users have.  In other words, their collections and services are driven by  demand, not supply.

In the digital realm, we have yet to see a library that freely provides such a digital collection at large scale for American public library users.   Which is not to say we don’t have large digital book collections– the one I maintain, for instance, has over a million freely readable titles, and Google Books and lots of other smaller digital projects have millions more.  But they function more as research or special-purpose collections than as collections for general public reference, education, or enjoyment.

The big reason for this, of course, is copyright.  In the US, anyone can freely digitize books and other resources published before 1923, but providing anything published after that requires copyright research and, usually, licensing, that tends to be both complex and expensive.  So the tendency of a lot of digital library projects is to focus on the older, obviously free material, and have little current material.  But a generally useful digital public library needs to be different.

And it can be, with the right motivation, strategy, and support.  The key insight is that while a strong digital public library needs to have high-quality, current knowledge resources, it doesn’t need to have all such resources, or even the most popular or commercially successful ones.  It just needs to acquire and maintain a few high-quality resources for each of the significant needs and aptitudes of its audience. Mind you, that’s still a lot of ground to cover, especially when you consider all the ages, education levels, languages, physical and mental abilities, vocational needs, interests, and demographic backgrounds that even a midsized town’s public library serves.  But it’s still a substantially smaller problem, and involves a smaller cost, than the enticing but elusive idea of providing instant free online access to everything for everyone.

There are various ways public digital libraries could acquire suitable materials proactively.  The America.gov books collection provides one interesting example.  The US State Department wanted to create a library of easy-to-read books on civics and American culture and history for an international audience.  Some of these books were created in-house by government staff.  Others were commissioned from outside authors.  Still others were adapted from previously published works, for which the State Department acquired rights.

A public digital library could similarly create, commission, solicit, or acquire rights to books that meet unfilled information needs of its patrons.  Ideally it would aim to acquire rights not just to distribute a work as-is, but also to adapt and remix it into new works, as many Creative Commons licenses allow.  This can potentially greatly increase the impact of any given work.  For instance, a compellingly written, beautifully illustrated book on dinosaurs might be originally written for 9-12 year old English speakers, and be noticeably obsolete due to new discoveries after 5 or 10 years.  But if a library’s community has reuse and adaptation rights, library members can translate, adapt, and update the book, so it becomes useful to a larger audience over a longer period of time.

This sort of collection building can potentially be expensive; indeed, it’s sobering that America.gov has now ceased being updated, due to budget cuts.  But there’s a lot that can be produced relatively inexpensively.  Khan Academy, for example, contains thousands of short, simple educational videos, exercises, and assessments created largely by one person, with the eventual goal of systematically covering the entire standard K-12 curriculum.  While I think a good educational library will require the involvement of many more people, the Khan example shows how much one person can get accomplished with a small budget, and projects like Wikipedia show that there’s plenty of cognitive surplus to go around, that a public library effort might usefully tap into.

Moreover, the markets for rights to previously authored content can potentially be made much more efficient than they are now.  Most books, for instance, go out of print relatively quickly, with little or no commercial exploitation thereafter.  And as others have noted, just trying to get permission to use  a work digitally, even apart from any royalties, can be very expensive and time-consuming.  But new initiatives like Gluejar aim to make it easier to match up people who would be happy to share their book rights with people who want to reuse them. Authors can collect a small fee (which could easily be higher than the residual royalties on an out-of-print book); readers get to share and adapt books that are useful to them.   And that can potentially be much cheaper than acquiring the rights to a new work, or creating one from scratch.

As I’ve described above, then, a digital public library could proactively build an accessible collection of high-quality, up to date online books and other knowledge resources, by finding, soliciting, acquiring, creating, and adapting works in response to the information needs of its users.  It would build up its collection proactively and systematically, while still being opportunistic enough to spot and pursue fruitful new collection possibilities.  Such a digital library could be a very useful supplement to local public libraries, would be open any time anywhere online, and could provide more resources and accessibility options than a local public library could provide on its own.  It would require a lot of people working together to make it work, including bibliographers, public service liaisons, authors, technical developers, and volunteers, both inside and outside existing libraries.  And it would require ongoing support, like other public libraries do, though a library that successfully serves a wide audience could also potentially tap into a wide base of funds and in-kind contributions.

Whether or not the DPLA plans to do it, I think a large-scale digital free public library with a proactively-built, high-quality, broad-audience general collection is something that a civilized society can and should build.  I’d be interested in hearing if others feel the same, or have suggestions, critiques, or alternatives to offer.

May 24, 2011
