The week between Christmas and New Year’s is mostly time off for me (I’ve added no new listings to The Online Books Page this past week, for instance), but even on vacation, as long as I have a working Internet connection, I still tend to fix bad links as I hear about them from readers’ reports.  I try to draw from a variety of free online book sources, instead of just a few big ones; that’s worthwhile to me because it increases the diversity of titles and editions on the site.  But the tradeoff is that many of these sites disappear, reorganize, or otherwise have links go bad over time.  I’m grateful to my readers for reporting bad links to me, and I can often fix other bad links to the same site when I fix the one reported to me.

The links and sites that persist, and those that don’t, often aren’t the ones you might expect.  Who’d have thought, for instance, that a shoestring-budget project that didn’t even maintain its own website until fairly recently would have the longest-lived (and still one of the largest) electronic book collections in common use, outlasting many better-funded or more systematically planned projects (as well as its own doggedly persistent original champion)?  Although the links to Project Gutenberg’s ebooks have changed over the years, the persistence of their etext numbers, and the proliferation of Gutenberg sites and mirrors, have made it relatively easy for me to keep links working for their more than 40,000 ebooks.

Some library-sponsored sites use persistent link redirection technologies, such as PURLs, to keep their links working.  But technology alone isn’t sufficient for persistence.  I recently had to update all of my links going to a PURL-based library consortium site.  I’m sure the people who worked at the organization hosting the site would have kept the links working if they could, but the organization itself was defunded by the state, and its functions were taken over by a new agency that didn’t preserve the links.

Fortunately, the failure had a couple of graceful aspects that eased recovery.  First of all, the old links didn’t stop working altogether, but redirected to the front page of a digital repository in which people could search for the titles they were looking for.  Second, the libraries in the consortium still maintained their own websites, and the old links included a serial number unique to each text (similar to Gutenberg’s etext numbers) that was also used by member libraries.  I found that in most cases I could automatically rewrite my links, using that serial number, so that they would point to a copy at a contributing library’s website.  This made it easier for me to rewrite my links, even though they go to new sites, than it’s often been for me to update links to sites that persist but reorganize.  (For instance, I’ve seen sites change to new content management systems that used completely different URLs from their old design, and then had to manually relocate and verify each link one at a time.)
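
As a rough illustration of the kind of automatic rewrite described above, here is a minimal Python sketch.  The host names, URL shapes, and function names are all hypothetical (the real sites aren’t named here, and their URL patterns surely differed); the only idea taken from the account above is that a serial number shared by the old and new sites can drive the rewrite.

```python
import re

# Hypothetical URL shapes, for illustration only.
# Old form: a consortium link ending in a per-text serial number.
OLD_LINK = re.compile(
    r"^http://purl\.consortium\.example/text/(?P<serial>[A-Za-z0-9]+)$"
)
# New form: a member library's copy, addressed by the same serial number.
NEW_LINK_TEMPLATE = "https://library.example.edu/texts/{serial}"


def rewrite_link(url):
    """Return the member-library link for an old consortium link.

    Returns None when the URL does not fit the old pattern, so the
    caller can set it aside for manual checking.
    """
    match = OLD_LINK.match(url)
    if match is None:
        return None
    return NEW_LINK_TEMPLATE.format(serial=match.group("serial"))


def rewrite_all(urls):
    """Split links into those rewritten automatically and those left for review."""
    rewritten = {}
    needs_review = []
    for url in urls:
        new_url = rewrite_link(url)
        if new_url is None:
            needs_review.append(url)
        else:
            rewritten[url] = new_url
    return rewritten, needs_review
```

Links that don’t fit the expected pattern are returned separately rather than guessed at, which matches the “in most cases” caveat: whatever can’t be rewritten mechanically still needs a person to look at it.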

Sometimes I have to replace links that still “work”, technically.  I used to have thousands of links to a Canadian consortium that provided free access to scanned public domain books and pamphlets from that country’s history.  Not long ago, I discovered that while my links still worked, the site had gone to a subscription model where readers have to pay for access beyond the first dozen or so pages of each text.  Given the precarious state of Canadian library funding, I’m sure the people running the site were simply doing what they thought necessary to ensure the persistence of the sponsoring organization (which continues to provide new electronic texts and services).  Personally, however, I was more concerned about the persistence of free access to the digitized texts I’d pointed to.  Fortunately, a number of the consortium’s member libraries had also uploaded copies of their scans to the Internet Archive, using the same serial numbers used on the Canadian consortium’s website.  As a result, I was able to quickly update most of my links to point to the Internet Archive’s copies.  I intend to track down working alternative links to the 200 or so remaining texts, or post requests seeking other copies of these texts, when time permits.  (I’ve also sent along a donation to the Internet Archive, in part to thank them for continuing to provide access to texts like these.)

It’s been said in digital library literature that persistence of identifiers is more a matter of policy than technology.  Based on the experiences I’ve related above, the practical persistence of links is even more a matter of will than of policy: the will (and ability) to keep maintaining access through changing conditions; the willingness to consider alternatives to specific organizational structures or policies if the original ones turn out not to be tenable; the willingness to pick things up again, or let others pick them up, after a failure.

It’s also clear from my experience that practically speaking, failure is not the main enemy of persistence.  More of a threat is not recovering from failure, or being so worried about failure that one doesn’t even begin to sustain the thing or the purpose that should persist.  To riff off a famous G. K. Chesterton quote, if it’s worth doing something, it’s worth being willing and ready to fail at doing it.  And then, to be willing to pick up again where you left off, or to make it easy for someone else to pick it up, and try something new.

That’s persistence.  That’s what’s ultimately gotten the dissertation rewritten, the estates settled, the blog picked up again, the books put and kept online for the world to read, and many other things I’ve found worthwhile, despite difficulties, anxieties, and setbacks.  I value that persistence, and I hope you value it as well, for the things you find worthwhile. I look forward to seeing where it takes us in the year to come.

About John Mark Ockerbloom

I'm a digital library architect and planner at the University of Pennsylvania.
