Sharing journals freely online

What are all the research journals that anyone can read freely online?  The answer is harder to determine than you might think.  Most research library catalogs can be searched for online serials (here’s what Penn Libraries gives access to, for instance), but it’s often hard for unaffiliated readers to determine what they can get access to, and what will throw up a paywall when they try following a link.

Current research

The best-known listing of current free research journals has been the Directory of Open Access Journals (DOAJ), a comprehensive listing of free-to-read research journals in all areas of scholarship. Given the ease with which anyone can throw up a web site and call it a “journal” regardless of its quality or its viability, some have worried that the directory might be a little too comprehensive to be useful.  A couple of years ago, though, DOAJ instituted more stringent criteria for what it accepts, and it recently weeded its listings of journals that did not reapply under its new criteria, or did not meet its requirements.   This week I am pleased to welcome over 8,000 of its journals to the extended-shelves listings of The Online Books Page.  The catalog entries are automatically derived from the data DOAJ provides; I’m also happy to create curated entries with more detailed cataloging on readers’ request.

Historic research

Scholarly journals go back centuries.  Many of these journals (and other periodicals) remain of interest to current scholars, whether they’re interested in the history of science and culture, the state of the natural world prior to recent environmental changes, or analyses and source documents that remain directly relevant to current scholarship.  Many older serials are also included in The Online Books Page’s extended shelves courtesy of HathiTrust, which currently offers over 130,000 serial records with at least some free-to-read content.  Many of these records are not for research journals, of course, and those that are can sometimes be fragmentary or hard to navigate.  I’m also happy to create organized, curated records for journals offered by HathiTrust and others at readers’ request.

Organizing and publicizing these records is important work, because many long-running journals don’t make their content freely available in the first place one might look.  Recently I indexed five journals founded over a century ago that are still used enough to be included in Harvard’s 250 most popular works: Isis, The Journal of Comparative Neurology, The Journal of Infectious Diseases, The Journal of Roman Studies, and The Philosophical Review.  For all five, public domain content that was available for free elsewhere online sat behind paywalls at the official journal site or at JSTOR (with fees for access ranging from $10 to $42 per article).  I’d much rather have readers find the free content than be stymied by a paywall.  So I’m compiling free links for these and other journals with public domain runs, whether they can be found at HathiTrust, JSTOR (which does make some early journal content, including from some of these journals, freely available), or other sites.

For many of these journals, the public domain extends as late as the 1960s due to non-renewal of copyright, so I’m also tracking when copyright renewals actually start for these journals.  I’ve done a complete inventory of serials published until 1950 that renewed their own copyrights up to 1977.  Some scholarly journals are in this list, but most are not, and many of those that are did not begin renewing copyrights until many years after 1922.  (For the five journals mentioned above, for instance, the first copyright-renewed issues were published in 1941, 1964, 1959, 1964, and 1964 respectively; 1964 was the first year for which renewals were automatic.)
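To make the idea of a "first renewed issue" year concrete, here is a minimal Python sketch of how such an inventory could be used to sort issues for review.  The five years come from the paragraph above; the function name, the labels, and the lookup structure are assumptions made for illustration, not a description of the actual inventory.  It is a triage aid only: it ignores contribution-level renewals and every other factor that affects copyright status, and it is not a legal determination.

```python
# First year with a copyright-renewed issue, as stated in the post.
FIRST_RENEWED_YEAR = {
    "Isis": 1941,
    "The Journal of Comparative Neurology": 1964,
    "The Journal of Infectious Diseases": 1959,
    "The Journal of Roman Studies": 1964,
    "The Philosophical Review": 1964,
}

# Last publication year that digitization projects have generally opened
# without renewal research (per the post).
NO_RESEARCH_CUTOFF = 1922


def triage_issue(journal: str, year: int) -> str:
    """Return a rough review label for one issue (hypothetical labels)."""
    if year <= NO_RESEARCH_CUTOFF:
        return "public domain"
    first_renewed = FIRST_RENEWED_YEAR.get(journal)
    if first_renewed is None:
        return "unknown: journal not in this table"
    if year < first_renewed:
        # No issue-level renewal recorded yet; still needs human review.
        return "candidate: no issue renewal recorded"
    return "check renewals: at or after first renewed issue"


if __name__ == "__main__":
    for title, year in [("Isis", 1930), ("Isis", 1941),
                        ("The Philosophical Review", 1920)]:
        print(title, year, "->", triage_issue(title, year))
```

An issue labeled as a candidate would still go through the kind of review described in the next paragraph before anyone opened it.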

Even so, major projects like HathiTrust and JSTOR have generally stopped opening journal content at 1922, partly out of a concern for the complexity of serial copyright research.  In particular, contributions to serials could have their own copyright renewals separate from renewals for the serials themselves.  Could this keep some unrenewed serials out of the public domain?  To answer this question, I’ve also started surveying information on contribution renewals, and adding information on those renewals to my inventory.  Having recently completed this survey for all 1920s serials, I can report that so far individual contributions to scholarly journals were almost never copyright-renewed on their own.  (Individual short stories, and articles for general-interest popular magazines, often were, but not articles intended for scientific or scholarly audiences.)  I’ll post an update if the situation changes in the 1930s or later. So far, though, it’s looking like, at least for research journals, serial digitization projects can start opening issues past 1922 with little risk.  There are some review requirements, but they’re comparable in complexity to the Copyright Review Management System that HathiTrust has used to successfully open access to hundreds of thousands of post-1922 public domain book volumes.

Recent research

Let’s not forget that a lot more recent research is also available freely online, often from journal publishers themselves.  DOAJ only tracks journals that make their content open access immediately, but there are also many journals that make their content freely readable online a few months or years after initial publication.  This content can then be found in repositories like PubMedCentral (see the journals noted as “Full” in the “participation” column), publishing platforms like Highwire Press (see the journals with entries in the “free back issues” column), or individual publishers’ programs such as Elsevier’s Open Archives.

Why are publishers leaving money on the table by making old but copyrighted content freely available instead of charging for it?  Often it’s because it’s what makes their supporters (scholars and their funders) happy.  NIH, which runs PubMedCentral, already mandates open access to research it funds, and many of the journals that fully participate in PubMedCentral’s free issue program are largely filled with NIH-backed research.  Similarly, I suspect that the high proportion of math journals in Elsevier’s Open Archives selection has something to do with the high proportion of mathematicians in the Cost of Knowledge protest against Elsevier.  When researchers, and their affiliated organizations, make their voices heard, publishers listen.

I’m happy to include listings for substantial free runs of significant research journals on The Online Books Page as well, whether they’re open access from the get-go or after a delay.  I won’t list journals that only make the occasional paid-for article available through a “hybrid” program, or those that only have sporadic “free sample” issues.  But if a journal you value has at least a continuous year’s worth of full-sized, complete issues permanently freely available, please let me know about it and I’ll be glad to check it out.

Sharing journal information

I’m not simply trying to build up my own website, though– I want to spread this information around, so that people can easily find free research journal content wherever they go.  Right now, I have a Dublin Core OAI feed for all curated Online Books Page listings as well as a monthly dump of my raw data file, both CC0-licensed.  But I think I could do more to get free journal information to libraries and other interested parties.  I don’t have MARC records for my listings at the moment, but I suspect that holdings information– what issues of which journals are freely available, and from whom– is more useful for me to provide than bibliographic descriptions of the journals (which can already be obtained from various other sources).  Would a KBART file, published online or made available to initiatives like the Global Open Knowledgebase, be useful?  Or would something else work better to get this free journal information more widely known and used?
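For readers unfamiliar with KBART, here is a minimal Python sketch of what a holdings export could look like: a tab-delimited file with one row per journal run.  The column names are a subset of those in the KBART recommended practice; the sample journal, its URL, and its coverage dates are entirely hypothetical, and the `F` (free) access value follows my reading of the recommendation, so check the specification before relying on it.

```python
import csv
import io

# A subset of KBART column names; the full recommendation defines more.
KBART_FIELDS = [
    "publication_title",
    "print_identifier",
    "online_identifier",
    "date_first_issue_online",
    "num_first_vol_online",
    "date_last_issue_online",
    "num_last_vol_online",
    "title_url",
    "coverage_depth",
    "access_type",
]


def holdings_to_kbart(holdings):
    """Serialize holdings dicts as tab-delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=KBART_FIELDS,
                            delimiter="\t", lineterminator="\n", restval="")
    writer.writeheader()
    for record in holdings:
        writer.writerow(record)  # missing columns are left empty
    return buffer.getvalue()


# Hypothetical holdings record, for illustration only.
SAMPLE_HOLDINGS = [{
    "publication_title": "Example Journal of Hypothetical Studies",
    "date_first_issue_online": "1901",
    "num_first_vol_online": "1",
    "date_last_issue_online": "1922",
    "num_last_vol_online": "22",
    "title_url": "https://example.org/journals/example",
    "coverage_depth": "fulltext",
    "access_type": "F",
}]

if __name__ == "__main__":
    print(holdings_to_kbart(SAMPLE_HOLDINGS), end="")
```

The point of the format is that each row says which volumes are available and where, which is the holdings information discussed above rather than a full bibliographic description.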

Issues and volumes vs. articles

Of course, many articles are made available online individually as well, as many journal publishers allow.  I don’t have the resources at this point to track articles at an individual level, but a growing number of other efforts do, whether they’re proprietary but comprehensive search platforms like Google Scholar and Web of Science, disciplinary repositories like arXiv and SSRN, institutional repositories and their aggregators like SHARE and BASE, or outright bootleg sites like Sci-Hub.  We know from them that it’s possible to index and provide access to the scholarly knowledge exchange at a global scale, but doing it accurately, openly, comprehensively, sustainably, and ethically is a bigger challenge.  I think it’s a challenge that the academic community can solve if we make it a priority.  We created the research; let’s also make it easy for the world to access it, learn from it, and put it to work.  Let’s make open access to research articles the norm, not the exception.

And as part of that, if you’d like to help me highlight and share information on free, authorized sources for online journal content, please alert me to relevant journals, make suggestions in the comments here, or get in touch with me offline.

Updates on library linking, Wikipedia, and what you can do

I’m gratified by the positive response I’ve been getting to the Forward To Libraries service I first introduced last month.  It really took off when I announced the templates for linking to libraries from Wikipedia a couple of weeks ago.  They’ve been written up in places like Boing Boing and in Wikipedia’s own Signpost newsletter.  The service now includes more than 150 libraries throughout the English-speaking world.  Wikipedia editors are also adding the link templates to articles; besides the handful I added myself, more than 450 have been added by other editors at this writing.  And I’ve heard from numerous librarians who now want to start editing Wikipedia themselves, both to add library links and to otherwise improve articles.  (Here’s how to become a Wikipedia editor.)

So far, I’ve largely provided this service on my own, with support from the University of Pennsylvania Libraries.   But I’d like to make the service more useful, and could use some help.  If you’re interested, here are some things you might want to know:

Some libraries are easier to link than others.   If you’re using one of many standard library catalogs or discovery systems, and you haven’t made substantial modifications to it, it’s easy for me to add your system. I basically just record what software you’re using and where on the Web the service runs, run some test searches to verify your system, and you’re good to go.  If you’re using a more customized, obscure, or home-grown system, I might still be able to add links to it, but it may take me more effort to figure out how to make useful search links into the system.  Any information you can provide would be helpful.  There are also certain off-the-shelf systems that I have problems with.  Many Polaris systems, for example, will give a “session timed out” message the first time you try to follow a search link into the system.   (Back up and try the link again, and everything will be fine for some time afterwards.)  Some other systems don’t seem to support deep search links in any consistent way that I’ve been able to determine, and not just some very old session-based systems, but also EBSCO’s fairly new EDS discovery platform.
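The idea of recording a library’s software and web location, then generating search links from that, can be sketched roughly as follows in Python.  This is not the actual Forward To Libraries code: the software names, URL patterns, and record fields here are all hypothetical, and real catalog systems each have their own link syntax.

```python
from urllib.parse import quote_plus

# Hypothetical URL patterns, keyed by a made-up catalog software name.
SEARCH_PATTERNS = {
    "examplecat": "{base}/search?type=subject&q={query}",
    "demodiscovery": "{base}/find?field=su&lookfor={query}",
}


def build_search_link(library, query):
    """Build a deep search link for a library, or None if unsupported."""
    pattern = SEARCH_PATTERNS.get(library["software"])
    if pattern is None:
        return None  # customized or unknown system: needs manual work
    return pattern.format(base=library["base_url"].rstrip("/"),
                          query=quote_plus(query))


if __name__ == "__main__":
    # Hypothetical library record: just the software and where it runs.
    library = {"software": "examplecat",
               "base_url": "https://catalog.example.edu/"}
    print(build_search_link(library, "Open access publishing"))
```

With a table like this, adding a library that runs an unmodified standard system only requires a new record, while a home-grown system requires working out a new pattern first.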

I’ve determined ways to link into these various systems from reading various documentation files I’ve found on the public Internet, along with some reverse-engineering of public web sites.  If you know of better ways to link to some of these systems that I haven’t yet figured out myself, and this information can be made public, let me know.

For now, I’m declining to list libraries that don’t have many English-language subject or Library of Congress name headings, because the results of English searches in those libraries will be misleadingly incomplete.  But I’m considering ways to include translated searches, where the data to support this is available, for a wider range of countries.  (VIAF already provides much relevant data for names.)

The most popular new Wikipedia Library resource template is also controversial, and might be modified or deleted.  I provide a number of different templates for linking from Wikipedia to libraries, including the inlined text templates “Library resources about” and “Library resources by”, and the all-in-one sidebar template “Library resources box”. By far the most used of these templates has been the Library resources box.  It’s easy to spot in an article, it organizes links clearly, and it’s easy for editors to recognize as a template that they can add to articles they find of interest.  But some Wikipedians, including at least one Wikipedia admin, have objected to the template.  They cite style guidelines that say external link templates should not use boxes or other graphical elements, but only appear as inlined text.  I’ve defended the boxes, noted how other library-related external links commonly appear in boxes, and proposed ways to address various Wikipedian concerns.  But it’s ultimately up to the Wikipedia community to determine whether or how library links will appear in Wikipedia articles.  To find out more about the issues, see the Library resources box talk page.  And if you’re a Wikipedia editor or user, feel free to weigh in on that page or other relevant forums.

I’m exploring ways to make it easier for readers to get to our libraries.  For one, I’m starting to record IP ranges for some institutions, so that local network users can follow “resources in your library” links straight to the institution’s library, without having to first register a preference.  (Users can still register a different preference if they want.)  IP-based routing is an experimental service, initially being provided to a limited number of institutions, and I may modify or withdraw it in the future.  If you’d like me to consider it for your institution, you can submit a request, with the relevant IP ranges (preferably in CIDR format) in the “anything we should know?” field.  Note that the IP ranges you submit will be published as part of the library data I’m sharing for this project.
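As an illustration of how IP-based routing with CIDR ranges can work in principle, here is a minimal Python sketch using the standard `ipaddress` module.  It is not the service’s actual implementation: the library identifiers are made up, the ranges are reserved documentation addresses, and giving a registered preference priority over the IP match is an assumption based on the description above.

```python
import ipaddress

# Hypothetical institutions, using reserved documentation address ranges.
LIBRARY_RANGES = {
    "example-university": ["192.0.2.0/24", "2001:db8:1::/48"],
    "example-college": ["198.51.100.0/25"],
}


def build_networks(ranges):
    """Parse CIDR strings into (network, library_id) pairs."""
    return [(ipaddress.ip_network(cidr), library_id)
            for library_id, cidrs in ranges.items()
            for cidr in cidrs]


def library_for_ip(address, networks, preference=None):
    """Pick a library: a registered preference wins, else match by IP."""
    if preference:
        return preference
    ip = ipaddress.ip_address(address)
    for network, library_id in networks:
        # Membership is False when IP versions differ, so mixing
        # IPv4 and IPv6 ranges in one list is safe.
        if ip in network:
            return library_id
    return None  # no match: fall back to asking the reader


if __name__ == "__main__":
    networks = build_networks(LIBRARY_RANGES)
    print(library_for_ip("192.0.2.17", networks))
    print(library_for_ip("198.51.100.200", networks))
```

A linear scan is fine for a small number of institutions; a larger deployment would probably want a prefix tree or sorted ranges instead.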

I’m starting to share my work on GitHub.  There is now a GitHub repository with selected data and code for the FTL project.  In it, you’ll find the data I use to link to the libraries enrolled in the service, and you’ll also see the code for the main CGI script used to forward readers to those libraries.  You can’t yet run the service out of the box yourself with the code and data provided so far, but I hope that what’s there will help people understand how the service works, and possibly implement similar services themselves if they’re so inclined.  The data’s released under CC0, so you can reuse it however you like; and the code is open-source licensed under the Educational Community License 2.0.  I hope to add more data and code over time, and I’m happy to hear suggestions for enhancements and improvements.

I’m hoping that as more people get involved, the service will improve, library resources will become more reachable online, and Wikipedia will become a more useful resource as well.  If you’d like to get involved yourself, I’d love to hear what you’re up to, and what suggestions you might have.