Quick! While there’s still time!

Folks interested in copyright information sharing may be interested in the following draft proposal for encoding copyright evidence in MARC records, the standard format used for library catalog records.

It was published on December 17, just a couple of days ago. The email I got asked for responses by January 7.

I’m hoping to comment, when I have the time to look it over carefully. Good thing I heard about it early (though so far I’ve only seen word of it on a private mailing list; I had to Google to verify that the recommendation was actually publicly linked from somewhere).

Bad thing for folks who don’t hear about it that quickly, or who are busy, especially with the holidays right in the middle of the comment period. This doesn’t appear to be unusual practice in the library world, though. For instance, the Library of Congress recently released a very interesting-looking draft report on the future of bibliographic control, and allowed all of 16 days for comments. I knew this report was coming, and I’ve been hoping to read it in detail and comment on it. But the deadline for feedback to the committee that prepared it has already passed. Well, I can always comment to the world at large here, but it would have been nice to have more time to read it over carefully and reflect on it, and then comment to the authors. This isn’t just my sentiment; K. G. Schneider has suggestions along these lines to the authors of this and other reports.

I know this can be easier said than done, as deadlines loom. In the working group I’m leading now on interfaces to integrated library systems, we ended up scrambling to get a downloadable recommendation draft together a few days before the conference where we said we’d discuss the draft. But we had most of the material that went into the draft on the wiki well before then. The next iteration of the recommendation, which will get more specific about what we need, is now in preparation, and can be followed and commented on live on the site.

The face-to-face discussions of that next iteration, which will target developers and technical folks from ILS vendors and open source development efforts, are planned for February. Hopefully those that are interested in participating will have the chance to read and comment well before those face-to-face meetings.

Yes, folks are often in a rush to meet a deadline, and may be shy about showing half-finished work that may draw all kinds of criticism or premature conclusions from the audience. I’m personally vulnerable to both of those pitfalls. And as a result, we often get over-short comment periods. But it doesn’t have to be that way. Groups can plan ahead for reasonably long comment periods. And readers who are used to the blog and the wiki should know how to deal appropriately with half-finished work. As long as the work doesn’t have to be kept confidential, and there’s a reasonably transparent process for authoring and updating documents, open authoring and comment can give your group insight from a wider variety of knowledgeable people, when it’s still early enough to matter.

(PS: Any relation between the title of this post and this prank is completely, um, coincidental.)

Quick links of interest

Some resources I’ve recently heard about that look like they deserve some attention (which I haven’t given them yet, but find worth noting now):

  • The first issue of the Code4lib journal is out, with lots of interesting-looking articles on next-generation catalogs. I’m looking forward to reading the article on facet-based navigation in LCSH, since I’ve done some work with LCSH navigation, and think there’s more we can do with it than just facets. Some of the articles may also be relevant to the ILS discovery interface group I’m working with.
  • The Copyright Clearance Center is starting a Discover Works wiki that’s attempting to collect and disseminate information about copyright holders of different works. It’s a tricky thing to do, and I still have to see how well this will work as a platform, but they’ve sucked in a lot of initial data, and it will be interesting to look at it in more detail. (Thanks to Merrilee Proffitt for the tip.)
  • The University of Texas has officially launched its Free the Books blog, a place to “challenge ideas about creation and authorship and discuss copyright laws, the public domain and orphan works.” They’re planning to digitize over a million volumes at Texas, so this isn’t simply a theoretical concern for them.

All worth a look, in my opinion, if you have the time. I hope to get more of a look at these soon.

Kids on the lawn, and copynorms

There’s an interesting discussion over in John Scalzi’s blog about a new organization called the Organization for Transformative Works, which essentially aims to legitimize fan fiction as first-class expressions safe from copyright challenges. As I write this, there are over 200 comments in response to the opinion by Scalzi (who is a professional author, and whose take on the issue is similar to my own). The opinions run a fairly wide range, some claiming fan fiction is either inherently wrong or inherently protected, but most taking some sort of middle position.

The discussion made me think back to my childhood, growing up in a small town in Connecticut. We lived in a small, fairly new suburban development with farmland and woods around it. A lot of the houses had kids in them our age, and when we got together, we’d often go exploring or playing in yards. Other than the streets themselves, virtually all the land was privately owned. And while we’d often play in our own yards, sometimes we’d go onto someone else’s land, sometimes when the owners weren’t around. One neighbor had a woody area with some paths and little hills that we liked to ride bikes in. Another had a hill in the backyard that was great for sledding in winter. And sometimes you could play in a leftover sandpile or a fallow field in ways you couldn’t anywhere else.

I don’t remember anyone specifically asking if we could play in these places, but it was generally accepted or at least tolerated by the neighbors. There were certain rules: I knew as early as I can remember that some yards were off limits, because the owners didn’t want kids there. And we knew that we’d have to leave immediately if the owner came out and told us to, but that there would be no further consequences, assuming we hadn’t done any damage or otherwise made trouble for the owner. Some of the norms I had to learn by experience. After a scolding (and, I think, a call to my parents) following an “emergency” bathroom break, I learned that entering someone’s yard without asking was one thing, but entering someone’s house without asking was another thing entirely.

Essentially we had a vibrant neighborhood culture built on casual and tolerated trespassing. It was fundamentally a social compact, rather than a legal one. If we kids went too far in infringing on people’s properties, or the adults went too extreme in clamping down, the whole thing would have fallen apart. A homeowner brandishing a shotgun, or a kid defying a request to leave with a “we have a legitimate right to be here!”, or a parent threatening legal action if their kid was hurt in someone else’s yard, would have disrupted things pretty badly.

So I’m looking at the OTW effort with some interest and trepidation. Although I don’t have much interest in “fan fiction” as such, I recognize its value. (Especially since I met a number of friends, including my wife, in a kind of “fan fiction” venue– but that’s a story for another time.) And there’s a good argument to be made that, as long as fan writers keep their work to themselves or to a small, private circle, it’s fair use.

But once it’s moved online into the public sphere, it seems to me the equivalent of playing in other people’s yards. You hope most people will be fine with it, and it may well even help maintain the social fabric; fan communities, after all, often end up buying lots of the original author’s books. Getting commercial with fanfic, or interfering with an author’s ability to work and make money, would be the equivalent of entering their house or building a booth on their front lawn– Not Done. And ultimately, it’s the author’s right to tell the kids to get off their lawn if they choose. Hopefully, that’s all they’ll want and need to do, and neither they nor the fans will be motivated to raise the stakes.

If you maintain a library, you might want to watch the sort of interaction going on here, even if you don’t particularly care about fanfic. Collection building and public service functions in the digital age often have to negotiate similar gray areas that aren’t neatly covered in law, but have important social aspects. It can be useful to look and see what sorts of practices build up owner and user communities, and what tears them down.

Notes (and Queries) about adopting serials

The other night, Mary was researching the authorship of a memoir of the Battle of Waterloo, originally published under the by-line “An Englishwoman”. After searching online, she found a link to an article published in an 1871 issue of Notes and Queries that looked promising. She clicked the link– and immediately hit a paywall.

Which was frustrating on multiple levels. First, our library has already bought Notes and Queries several times over. We have print copies– for most years, multiple print copies– of all the volumes from the start of the journal in 1849 up to the present. We also buy access to the online edition. But the regular online access only goes back to 1996– before that, it seems you have to buy an extra package or pay per article.

Okay. But, second, we’re dealing with an article from 1871, long since passed into the public domain. Yes, the publisher has spent money to digitize and store these old issues, and would understandably like some return for its investment. But this is the sort of resource that, with all the mass digitization now going on, should really be free online in some form.

In fact, it is, if you can find it. High-profile mass digitization projects are scanning serial volumes along with books. So far, they’re not giving serials particular attention or care, but they’re there. In order to be really useful, these serial volumes need to be consciously adopted. There are a number of ways one can do this:

  • First, one can digitize them. I’ve found at least three projects that have digitized various volumes of N&Q: the Internet Library of Early Journals (ILEJ), the Open Content Alliance (OCA), and Google. The first of these digitized systematically, but only up to 1869. The latter two don’t seem to have been as systematic, but between them they managed to digitize nearly all the later volumes up to 1922.
  • To make particular issues easily findable, though, one needs to organize them. I got worked up enough to do that for N&Q; the results are here. Except for the ILEJ range, I had to do it volume by volume; the OCA and Google collections didn’t neatly arrange them, or make it easy to find a particular volume.
  • To make them easier to use, it also helps to transform them. Project Gutenberg’s Distributed Proofreaders, for instance, has taken the digitized page images of many of the early issues and produced transcriptions that are considerably more compact, easy to search, and textually accurate than their initial scan-and-OCR digitizations.
  • Researching their copyrights may enable more journal issues to be scanned. Google is very conservative about public domain copyright determinations, particularly abroad, sometimes locking up content as far back as 1865 to some users. The OCA scanners were confident enough to go all the way to 1922. It might be possible to go further still: I’ve discovered that Notes and Queries copyrights were not renewed in the US. If post-1922 volumes were subject to US renewal requirements (which requires more research, into questions like whether US-based subscriptions counted as publication here), a number of them may now be out of copyright here.
  • But why stop with uncopyrighted material? Working with the authors of articles in the serial could yield still more. Notes and Queries, like many Oxford journals, appears to have a policy allowing author self-archiving of their articles, in this case once an issue’s been out for at least 2 years. So, conceivably, motivated readers could go through the tables of contents from issues for 2005 and before, try to reach authors, and persuade them (or help them) to put the articles into their institutional or disciplinary repositories, assuming they have them. (And reader intervention could help: institutional repositories tend not to fill up on their own, and many libraries can’t or won’t commit the resources to fill them themselves.)

So, here are five ways one can adopt a serial: digitizing, organizing, transforming, copyright-clearing, and getting content from authors and rightsholders. There are some interesting examples of many of these adoption strategies. (Consider, for instance, the Directory of Open Access Journals; and here’s a big wiki page organizing online pre-1930 German-language serials.) And more can be done: for instance, if enough readers supporting open access adopt various journals that allow author self-archiving, we could see lots more current research content openly findable online.

Those, then, are my notes. My queries: What further serial adoption efforts should we know about? And what should we work on?

We the mediators

Back in early 2006, Peter Brantley (now the director of the Digital Library Federation) got a lot of interesting folks in libraries and publishing together in one room to talk about issues related to reading in the digital age. While libraries and publishers have different focuses and priorities, we both serve as mediators between authors and audience, and both kinds of mediators are seeing dramatic upheavals and innovations in the ways we carry out our missions.

So the meeting touched off an interesting series of discussions. I’m having a hard time finding the “official” presentation pages from the original meeting, but here’s a short summary from me and a more detailed list of talk summaries from Tim O’Reilly. After the meeting, discussions continued on a mailing list of participants that over time added a number of other folks in publishing and libraries.

A number of the folks involved, mostly on the publishing side of things, have now started a group blog to take many of these conversations public. The blog is called Publishing Frontier, with the tagline “a raucous public discussion of the publishing revolution”. Its starting contributors include folks who’ve worked at trade publishers, scientific imprints, commercial research labs, and grassroots book digitizing.

The blog promises to be an interesting forum and chronicle of the digital revolutions in communication, largely from publishing perspectives (much as I hope this blog to be another such forum and chronicle, largely from librarianship perspectives). I encourage readers here to check it out.

Copyright information sharing: An update

I regularly get mail about the web pages I have on copyright registrations and renewals and the inventory I did on the first renewals of periodicals. Turns out a lot of folks, both inside and outside of libraries, are interested in reviving and repurposing old creative works, if they could just figure out whether they were still under copyright, and how to reach the copyright holders if they are.

Here’s part of a not atypical query I received recently (posted with permission):

My anticipated enterprise regards short fiction published predominantly in monthly war-era periodicals; the “usual” specificity of interest – ’23-’63 periodicals which might not have been timely renewed.

It appears to me, after an exhaustive study of public domain law & online resources, that your work is the current state of the art: the closest thing – right now – to definitive.

My question, then…is there, so far as you are aware, any “quantum leaps” anticipated to come down the pike in some forseeable future re: a “definitive” means to check ’23 – ’63 renewals? Especially online/searchable?

Again, THANKS beyond measure for your work; it’s the closest-to-perfect tool yet for exasperated publishers seeking to simply ascertain whether the project they’re considering is “doing the right thing” where not violating someone else’s property is concerned!

It’s both gratifying and frustrating to receive email like this: gratifying because it’s always nice to hear your work is benefiting people; frustrating because I know there’s so much more that could be done to share copyright information, especially when there are so many people interested in it.

And in fact, more is being done, and planned. I organized an open discussion at last spring’s Digital Library Federation (DLF) forum called “Sharing Copyright Information: Opportunities for Collaboration”. It was an interesting and wide-ranging conversation, involving people from a number of libraries and other organizations. Here are the notes from the session. For a good overview and background on many of the copyright issues discussed, see Stanford’s Copyright & Fair Use website.

There have been some notable developments since the spring. Carl Malamud and Peter Brantley have “liberated” recent copyright registration and renewal data from the Copyright Office’s database, making them available for analysis and indexing. Mimi Calter at Stanford has been refining and analyzing their database on book copyright renewals. Bill Carney at OCLC is planning a project for registering copyright information with WorldCat entries. You can read more about these and other initiatives in Peter Brantley’s “Checking Copyright” blog post from last month.

We’re still a long way from a one-stop shop for copyright research. But I hope to use the new data Peter and Carl have liberated to complete my inventory of periodical renewals (which now is complete only to about 1950). I’ve also heard from more than one group that would like to digitize all of the pre-1978 copyright registration and renewal records that are not in the Copyright Office’s online database. If we had good machine-readable data for new and old copyrights, we could construct powerful search engines for copyright registration research. I don’t know who’s actually going to supply this data, though, or how long it will be before it’s all available.
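To make that idea a little more concrete, here is a minimal sketch of the kind of lookup such machine-readable data could support. Everything in it is an assumption for illustration: the `Renewal` record layout, the `triage` function, and the sample identifier are hypothetical, not the Copyright Office's actual data format. The date rules are also deliberately simplified to the 1923-1963 renewal window mentioned above, and ignore complications like foreign publication, copyright notice, and the provenance questions discussed below.

```python
from dataclasses import dataclass


# Hypothetical, simplified record layout; real renewal records carry more fields.
@dataclass(frozen=True)
class Renewal:
    title: str
    publication_year: int
    renewal_id: str  # placeholder identifier, not a real registration number


def normalize(title):
    """Lowercase and collapse whitespace so trivially different titles match."""
    return " ".join(title.lower().split())


def build_index(renewals):
    """Map (normalized title, publication year) to the matching renewal records."""
    index = {}
    for r in renewals:
        index.setdefault((normalize(r.title), r.publication_year), []).append(r)
    return index


def triage(title, publication_year, index):
    """Return a rough research status for a work first published in the US.

    Simplified assumption: only 1923-1963 publications depend on a renewal filing.
    """
    if publication_year < 1923:
        return "pre-1923: generally public domain in the US"
    if publication_year > 1963:
        return "post-1963: renewal not required; assume still in copyright"
    hits = index.get((normalize(title), publication_year), [])
    if hits:
        return "renewal found: " + ", ".join(h.renewal_id for h in hits)
    return "no renewal found in this data: needs further research"
```

Note that a miss in the index is reported as "needs further research" rather than "public domain": a missing record might only mean the data is incomplete, or that the work was renewed under a different title.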

Of course, copyright registration searching is just one part of the problem of copyright clearance, which can involve complicated issues of provenance of works, rights, and information. I’ve recently made a presentation giving an overview of some of these questions (slides here) to an interested group of computer scientists, and a paper I wrote with more details on provenance issues in copyright research should be published later this month. (I’ll link to it when it comes out.)

I don’t want to have lots of people exerting redundant, expensive efforts to clear copyright, or to be deterred from reusing older works because clearing copyright is too difficult. It helps for those of us who are working in this area to keep each other informed about what we’re doing and finding out. So feel free to add a comment to this post if you have a question or useful information on copyright clearance. You can also email me (address in the “about” page) to suggest relevant items for future posts on copyright issues.

Book People postscript

This past Friday I closed down the Book People mailing list, a forum for people making and reading free online books that Mary and I started in 1997. Much of the activity of folks on the list would be early examples of the sort of citizen librarianship that I referred to in the first post to this blog. I announced the list’s closing about three weeks ago, giving my reasons in a later post.

In the last three weeks of the list’s activity, various listmembers wound up conversations, planned or announced various new forums, and said their goodbyes. You can read all this, and the rest of the list’s history, in the archives, which are remaining online. The most direct successor to the list is Book Futures, a Yahoo Groups mailing list maintained by Kent Larsen, and there were some other lists announced as well.

I closed the list with my own retrospection and thanks. But I continued to get some more listmember reflections even after my last post (and for all I know some more may have come in after the list’s email address was decommissioned). Here’s one of them, a message I got from Michael Stutz (posted here with his permission):


When you started Book People back in 1997, I began a list for the discussion of what has now become known as “open content,” in an attempt to prove a concept I’d been working on in obscurity for years.

My list, Linart, shut down years ago, and that goes so far back that a whole lifetime is packed in the interim. But I do know firsthand what it’s like to administer and moderate a list like this and I know that to do it justice takes more time and work than most people would believe. I’ve never known a list with closer and more careful moderation than Book People. Absolutely every time a BP post came into my inbox, I thought of this and how keeping a good list running takes a massive amount of work.

It’s always sad to see an end, but looking back I do think that Book People had a good run and, like Linart, it reached the end of its course—a decade ago, the idea of publishing an online or electronic edition of a book was a novelty, there weren’t so many of them and they weren’t always easy to find. Not so anymore—at the very instant your announcement came into my inbox, I was downloading several gigabytes of rare old books, dozens of volumes among hundreds that I’d found through a full-text keyword search.

Just the same, Linart was a great idea because at the time no one was publishing copylefted work online—and even more importantly, _no one thought it was possible._ My main interests were books and art, but I wanted to see every kind of copyrighted work digitized online with “copyleft” licensing. And it might seem crazy now, but the reactions from open source and free software figures to my dream went from complete disinterest to overt hostility: “Copyleft is for software! You can’t do that with books, music, art”—replies like that were typical. Few people in the world were copylefting non-software works, but Linart is best left in the 20th century and the world as it was before Wikipedia and Creative Commons. In fact, after seeing the results of several years of online “open content” and having tested it extensively firsthand, I’m now critical of the method—I know its weaknesses and errors and have come to see that it isn’t the right solution for the age.

But what remains important today is the greater question of online publishing in general—and, of course, the future of the book. As a reader I’m nearly exclusively online for newly-published material, and as a writer that’s also where I want to find my audience, but how to do it and how it will all work out, how new writing and new books will be published and read and sold, remains entirely unclear—I’m still looking for the answer, and so I think the new Book Futures list is very aptly named and hope it takes off on its quest from this place we’ve come to after over a decade’s worth of Book People.

If anyone else from the list would like to add any postscripts or other comments here, feel free to add a comment to this post.