Forward to Libraries update, and some thoughts on sustainability and scale

It’s been a while since I posted about Forward to Libraries, but if you’ve been following my GitHub repo, you may have noticed that it’s had a steady stream of updates and growth. If making connections across library collections or between libraries and Wikipedia interests you, the service is more comprehensive and wide-ranging than ever. Here’s an update:

Number of libraries

We now support forwarding to over 1,000 library systems worldwide, running dozens of off-the-shelf and custom-developed catalog and discovery systems. I’ve expanded coverage in all 50 states, Canada, the UK, and Australia, and have also added links to more countries on all inhabited continents. (Because the system currently focuses on Library of Congress headings, it works best with Anglo-American catalogs, but it yields acceptable results in some other catalogs as well, like the large Norwegian research library union catalog I just added today.) While major research libraries and big-city public libraries are well-represented, I’ve also been trying to add HBCUs, community colleges, rural library networks, and other sometimes-overlooked communities. And I respond quickly to requests from users to add their libraries.

Wikipedia coverage

Forward to Libraries can be used to make links between library collections, and to and from my millions-strong Online Books Page listings. But it’s also known for its interoperation with Wikipedia, where it knows how to link between over half a million distinct Library of Congress name and subject headings and their corresponding English-language Wikipedia articles. The majority of these mappings are name mappings provided by OCLC’s VIAF, which in its most recent data dump includes over 485,000 VIAF identifiers that have both a Library of Congress Name Authorities identifier and an English Wikipedia article link. An additional 50,000 or so topical, demographic, and geographical LC subject headings are also mapped. These mappings derive from exact matches, algorithmic matching for certain kinds of heading similarities, and a manually curated mappings file that has grown over time to include more than 22,000 correspondences.
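To make the three sources of subject mappings more concrete, here is a minimal Python sketch of how a lookup might combine them. It is not the actual Forward to Libraries code: the function names, the order in which the sources are tried, and the single "un-invert the heading" rule are all assumptions made for illustration, and the sample headings are invented test data.

```python
from typing import Dict, Optional, Set


def uninvert(heading: str) -> Optional[str]:
    """Turn an inverted heading like 'Art, Medieval' into 'Medieval art'.

    Hypothetical stand-in for "algorithmic matching"; the real service
    may use entirely different rules.
    """
    parts = heading.split(", ")
    if len(parts) != 2 or not all(parts):
        return None
    head, qualifier = parts
    return f"{qualifier} {head[0].lower()}{head[1:]}"


def find_article(
    heading: str, manual: Dict[str, str], titles: Set[str]
) -> Optional[str]:
    """Return a Wikipedia article title for a subject heading, or None.

    Assumed order: manual mappings first, then exact title matches,
    then the algorithmic rule.
    """
    if heading in manual:
        return manual[heading]
    if heading in titles:
        return heading
    candidate = uninvert(heading)
    if candidate in titles:
        return candidate
    return None
```

Trying the curated file first lets a hand-made mapping override a misleading exact or algorithmic match, which is one plausible reason to keep such a file at all.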

What you can do with this data

If you’re browsing a topic or an author’s works on the Online Books Page, you can follow links to searches for the same topic or author in any of the 1000+ libraries currently in my dataset. (If your library’s not one of them, just ask me to add it.) If the topic or the author is covered in one of the 500,000 corresponding Wikipedia articles the system knows of, you’ll also be offered a link to the relevant article.

If you’re browsing certain Wikipedia articles, you’ll also find links from them back to library searches in your favorite of those 1000+ library systems (or any other of those systems you wish to search). Right now those links use templates that must be manually placed, so there are only about 2,500 Wikipedia articles with those links, but any Wikipedia editor can add the templates to additional articles. (A bot could potentially add them more quickly, but that would require some negotiation with the Wikipedia community that I haven’t undertaken to date.) If you’re involved in a Wikipedia-library collaboration project (like this one), you may want to add one of these templates when editing articles on topics that are likely to have relevant source materials in multiple libraries. (The most commonly used template is the Library Resources Box, generally added to the External Links or Further Reading section of an article.)

If you’re interested in offering a similar library or Wikipedia linking service from your own catalog or discovery system, I’d be interested in hearing from you. You can either point to my forwarding service (using a standard linking syntax) or implement your own forwarder based on my code and data on GitHub. Right now it requires some effort and expertise to implement either method, but I’m happy to work with interested libraries or developers to make forwarding easier to implement.

Scaling and sustaining the service

The Forward to Libraries service still runs largely as a 1-person part-time project. (The Wikipedia templates are largely placed by others, and the service fundamentally depends on regularly updated data sets from OCLC, the Library of Congress, and Wikipedia, but I maintain the code and coordinate the central referral database myself.)

Part-time, 1-person projects raise some common questions: “Will they be sustained?” and “Can they scale?” I wasn’t sure myself of the answers to those questions when I started developing this service. Fortunately, I went ahead and introduced it anyway, and three-going-on-four years later, I’m happy to say that the basic service *is* in fact more sustainable and scalable than I’d thought it might be. The code is fairly simple, and doesn’t require a lot of updating to keep running. The main scale issues for the basic service have to do with the number of library systems and the number of topic mappings in the system, and those are both manageable.

The number of libraries turns out to be the more challenging factor to manage. Libraries change their catalogs and discovery systems on a regular basis, and when they do, search links that worked for their old catalogs often don’t work for the new ones. I have an automated tool that I run occasionally to flag libraries that return an error code to my search links; it’s not very sophisticated, but it does alert me to many libraries whose profiles I need to update, without my having to check all of them manually. (If you find any libraries where forwarding is no longer working, you can also use the suggestion form to alert me to the problem.) The more pressing scaling problem at the moment is the user experience: right now, when you’re asked to choose a library, the program shows a list of links to all 1000+ libraries currently in the system. That can be a bit much to handle, especially for users whose data is metered. Updating the library choice form to only show local libraries after the user selects the state, province or country they’re in will cut down on the data sent to the user; that may cost the user an extra click, but the tradeoff seems worth it at this point.
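A link checker of the kind described above could look roughly like the following Python sketch. It is an illustration under stated assumptions, not the tool I actually run: the names `fetch_status` and `flag_broken` are hypothetical, the input is assumed to be one sample search URL per library, and "error" is assumed to mean an HTTP status of 400 or above, or no response at all.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, List, Optional


def fetch_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        return err.code
    except OSError:
        # Covers URLError, timeouts, and other connection failures.
        return None


def flag_broken(
    search_links: Dict[str, str],
    get_status: Callable[[str], Optional[int]] = fetch_status,
) -> List[str]:
    """Return the names of libraries whose sample search link looks broken."""
    flagged = []
    for library, url in search_links.items():
        status = get_status(url)
        if status is None or status >= 400:
            flagged.append(library)
    return sorted(flagged)
```

Passing the status function in as a parameter makes it possible to try the logic against canned responses without touching the network. A check like this shares the limitation mentioned above: a new catalog that answers an old search link with a normal page and a 200 status will not be flagged.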

The number of topic mappings, on the other hand, has been easier to manage than I’d thought. VIAF publishes updated data files for names about once a month, and I can run a script over each new file to automatically update my service’s name heading mappings. Likewise, Wikipedia now provides twice-a-month dump files of its English-language encyclopedia. I can download one of Wikipedia’s files in a couple of hours, and then run a script that flags any topical articles I map to that have gone away, changed their title, or redirected to other articles. I can then change my mappings file accordingly in under an hour. Library of Congress subject headings change as well, but they don’t change very fast. New and changed topical headings are published online about once a month, and I can generally add or change mappings from one of these updates within a few hours. I spend a few spare-time hours each month adding mappings for older subject headings, so if the current rate of LCSH growth holds, in theory I could cover *all* LCSH topical headings in my manual mappings file after some number of years. In practice, I don’t have to do this, especially as topical mappings also start to get collaboratively managed in places like Wikidata. (I’m not currently working with their data set, but hope to do so in the future.)
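The Wikipedia check described above can be sketched in a few lines of Python. This is a simplified illustration, not my actual script: it assumes the dump has already been reduced to a set of current article titles and a dictionary of redirects, and the name `flag_stale` and the report format are hypothetical. A renamed article normally leaves a redirect behind, so the sketch treats title changes and redirects as the same case.

```python
from typing import Dict, Optional, Set, Tuple


def flag_stale(
    mappings: Dict[str, str],
    titles: Set[str],
    redirects: Dict[str, str],
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Report mapped articles that no longer exist under the mapped title.

    mappings:  subject heading -> Wikipedia article title currently mapped
    titles:    titles of real (non-redirect) articles in the new dump
    redirects: redirect title -> target title in the new dump
    """
    report = {}
    for heading, article in mappings.items():
        if article in redirects:
            # Renamed or merged: suggest the redirect target for review.
            report[heading] = ("redirected", redirects[article])
        elif article not in titles:
            # Deleted, or never matched a title in this dump.
            report[heading] = ("missing", None)
    return report
```

The report only flags candidates; deciding whether a redirect target is still the right article for a heading remains a manual step, which fits the under-an-hour review described above.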

Broadening the service

While I can maintain and gradually grow the basic service with my current resources, broadening it would require more. Not every library, particularly outside the US, uses Library of Congress Subject Headings, so it would be nice to offer mappings to more subject ontologies and languages. Similarly, not everyone likes to work with Wikipedia (often with good reason), and it’d be nice to support links to and from alternative knowledge bases as well. The basic architecture of Forward to Libraries is capable of handling multiple library and web-based ontologies, but additional coding and data maintenance would be required. There are also various ways to build on the service to engage more deeply with libraries and current topics of interest; I’ve explored some ideas along these lines, but haven’t had the time to implement them.

Things continuing as they are, though, I should be able to maintain and grow the basic Forward to Libraries service for quite some time to come. I’m thankful to the people and the data providers that have made this service possible. And if you’re interested in doing more with it, or helping develop it in new directions, I’d be very glad to talk with you.

About John Mark Ockerbloom
