Everybody's Libraries

May 27, 2009

Getting bugs out of our systems

Filed under: crimes and misdemeanors,people — John Mark Ockerbloom @ 10:21 am

Very soon after we start learning to program, we start learning to deal with bugs.   Folks who have programmed for a while might forget that effective bug handling, like effective programming, is a skill that doesn’t come entirely naturally.

Many of us instinctively avoid criticism, ignore it, minimize it, or even argue against our critics.  But our programs will almost invariably include bugs, and to handle them, we have to go against the grain of our instincts.  If we’re smart, we make it as easy as possible to report bugs to us, so we minimize their impact.  We respect and listen carefully to what our clients tell us, to understand the problems they’re encountering with our product.  After we fix the bugs, we often  review our code and our practices to avoid similar problems in the future.

It helps a lot if we can keep our egos out of the bug-fixing process.  I know that my work will sometimes have bugs, and that a bug report should not be taken as a personal attack.  Rather, I try to make it an opportunity to improve my products and my future work.

Bugs exist at various levels.  Bugs that cause crashes are often the easiest to deal with: it’s clear that something is going wrong, and it usually isn’t hard to figure out what to do about it.  But less obvious bugs can be worse.  One product our library uses, for example, implemented boolean searches incorrectly, omitting important results. This kind of bug can mislead lots of people who never notice the problem.   (And it can also take longer to address.  I had to send multiple emails and examples to the developers of this product before they admitted that their implementation was buggy.)

Bugs at the overall system level can be the worst.  The reservation system with interminable holds, the customer support service that never returns our calls, the open source effort that repels key constituencies it should be attracting: all of these are buggy systems, and they can drive people away just as surely as a crashing program.  As Michael Bolton puts it, “a bug is something that bugs somebody who matters.”  System-level bugs can be challenging to fix, but they can be the most essential to repair.

I hope none of these principles seems new or controversial.  But I’ve recently seen a few bug reports concerning the Ruby on Rails community that drew many responses that ignored them.  The reports concerned buggy systems, not buggy code.  In particular, they noted a professional developer conference that attracted very few women, and an accepted presentation at that conference that included blatantly unprofessional themes, themes that one could easily predict would put off many of the people who could benefit from the talk.  (They would be particularly problematic if you were one of the few women there, but I found it distinctly off-putting as well.)

The comments on those two posts include plenty of examples of denial, minimization, rationalization, and attacking the reporters of the bugs.  (Indeed, some read as if they were cribbed right from this checklist of cliched defenses of in-group privilege.)

Assuming that the respondents are active members of the Ruby community, the responses suggest that there are still serious social bugs in that community.  I recently came back from another open source-focused conference (one that had a significantly higher proportion of women, though still far from 50%), where there were some good things said about using Ruby on Rails for library application development.  I like open source projects with good technical bases, but if I’m going to rely on a technology, I want its developer community to be healthy.  Healthy communities generally provide more reliable, long lasting development support, and can be much easier and more pleasant to work with.

It can be positively uncomfortable for many of us to confront social problems, particularly ones in our own communities that we might be partly responsible for.  (And Ruby is not the only community that’s had this kind of problem.)  Perhaps if we get used to thinking of these problems as bugs, welcoming and paying close attention to reports, and getting our egos out of the way, we’ll find it easier to fix them.

Gender inequities are bugs in our systems.  Bugs happen.  But they can be fixed.  As my library considers involvement in various community source development projects, I want to find out more about what these communities are doing, going forward, to fix and prevent these sorts of bugs.

May 16, 2009

May 15, 2009

May 8, 2009

What you’re asked to give away

Filed under: copyright,crimes and misdemeanors,open access,publishing,serials — John Mark Ockerbloom @ 9:54 pm

If you’ve published an article in an Elsevier journal, you might have missed an interesting aspect of the contract you signed with them to get published.  It goes something like this:

I grant Elsevier the exclusive right to select and reproduce any portions they choose from my research article to market drugs, medical devices, or any other commercial product, regardless of whether I approve of the product or the marketing.

What, you don’t remember agreeing to that?  Actually, the words above are mine.  But while it isn’t explicitly stated in author agreements, Elsevier authors usually grant that right implicitly. Elsevier’s typical author agreement requires you to sign over your entire copyright to them. Why ask for the whole copyright, instead of just, say, first serial rights,  and whatever else suffices for them to include the article in their journal and article databases?  Elsevier explains:

Elsevier wants to ensure that it has the exclusive distribution rights for all media. Copyright transfer eliminates any ambiguity or uncertainty about Elsevier’s ability to distribute, sub-license and protect the article from unauthorized copying or alteration.

That “unauthorized” would be “unauthorized by them”.   Not “unauthorized by you”.  Once you sign, you’ve given up the right to authorize copying or alteration, or any other rights in the copyright, except for rights they offer back to you.  For instance, you can’t “sub-license” your article for anything Elsevier deems “commercial purposes”.  But they can, and do.

And sometimes those commercial purposes have had questionable ethics.  The Scientist reported about a week ago that “Merck published [a] fake journal” with Elsevier.  (Free registration may be required to read the article.)  As they report:

Merck paid an undisclosed sum to Elsevier to produce several volumes of a publication that had the look of a peer-reviewed medical journal, but contained only reprinted or summarized articles–most of which presented data favorable to Merck products–that appeared to act solely as marketing tools with no disclosure of company sponsorship.

The publication, Australasian Journal of Bone and Joint Medicine, was published by an Elsevier subsidiary called Excerpta Medica.  As that subsidiary explains on their web site, “We partner with our clients in the pharmaceutical and biotech communities to educate the global health care community and enable them to make well-informed decisions regarding treatment options.”  In other words, they’re a PR agency for drug companies and other companies selling medical products.  Part of what they do is publish various periodicals designed to promote their clients.

Now, a number of companies publish sponsored magazines, and usually such publications clearly disclose their sponsorship, or are otherwise easily recognizable as “throwaway” commercial journals.  But this publication was designed to look more like a peer-reviewed scientific journal.   The Scientist reports this court testimony from a medical journal editor:

An “average reader” (presumably a doctor) could easily mistake the publication for a “genuine” peer reviewed medical journal, [George Jelinek] said in his testimony. “Only close inspection of the journals, along with knowledge of medical journals and publishing conventions, enabled me to determine that the Journal was not, in fact, a peer reviewed medical journal, but instead a marketing publication for MSD[A].”

Indeed, one of the publication’s “honorary editors” admitted to the Scientist that it included marketing material, but that “[i]t also had papers that were excerpted from other peer-reviewed journals. I don’t think it’s fair to say it was totally a marketing journal.”  But that was what Merck paid Elsevier for, and the excerpts from real Elsevier-acquired research articles helped the publication as a whole look like disinterested scholarship instead of advertising.

Elsevier did show some embarrassment from these revelations, particularly after widespread online outrage.  A statement posted yesterday by an Elsevier spokesman admitted the journal did not have “the appropriate disclosures”, and added

I have affirmed our business practices as they relate to what defines a journal and the proper use of disclosure language with our employees to ensure this does not happen again.

That’s certainly a step up from a previous statement quoted in the Scientist article, which, after also admitting the disclosure problems in the “journal”, simply said “Elsevier’s current disclosure policies meet the rigor and requirements of the current publishing environment,” and made no promises about what they would do in the future.

But the new statement leaves unanswered the question of why 4 “peer reviewed journals” are still published under the imprint of a PR agency whose stated mission is to “support our client’s marketing objectives with strategic communications solutions in [areas that include] Medical Publishing.”  And legally, Excerpta Medica still has the right to cherry-pick from any article signed over to Elsevier in any of their marketing publications.  Or, as they announce to potential clients, “we can leverage the resources of the world’s largest medical and scientific publisher.”  Even with what Elsevier considers “proper use of disclosure language”, some authors might not want their writing used in this way.

Am I being unfair to Elsevier here?  They’re not the only academic publisher that asks its authors to sign over their copyrights.  And some of the more liberal open publication licenses, which I’ve been known to recommend, are broad enough that they too give marketers rights to reuse one’s work in their promotions.

On the first of those points, I recommend in general that authors avoid signing over their rights entirely (as I’ve managed previously), no matter who the publisher is.  But last I checked, most other academic publishers don’t also own a PR firm for commercial product marketing.  (And if any do,  they should disclose this possible use in their interactions with authors. I find no explicit disclosure of this in either Elsevier’s model agreement or on the current version of Elsevier’s author rights page.)

On the second point, if you grant an open publication license, you generally know what you’re getting into.  And you can still defend against misuse of your work in ways that you can’t do if you just sign over your copyright to a publisher.   Some open access licenses, for instance, include an attribution condition that requires any reuse of the article to credit and point to the original source, and derivation conditions that either prohibit changes or require changes to be disclosed.  (And some licenses simply prohibit commercial use altogether except by permission.)  Whatever license you choose, if a company does quote your work out of context in its marketing, and you’ve kept your own rights to reprint the article, you can publish a rebuttal as widely as you like, showing the omitted context that counters a company’s claims.  These conditions and rights can provide potent deterrents against misuse of your articles.

Often the debates over scholarly author rights and open access focus on who gets to read and use scholarly articles, and what gets paid to whom.  This episode highlights another important part of the debate: who gets the right to guard the integrity of one’s scholarship.  In the light of recent revelations, authors might want to think carefully about whether to sign that right away, and to whom.

[Updates, 9 May 2009: Some spelling corrected, and a note added that disclosure is not the only potential concern of authors whose works are used for marketing purposes.]

May 5, 2009

April 23, 2009

David Reed: Some extracts from his life and letters

Filed under: online books,people — John Mark Ockerbloom @ 11:36 pm

Last summer I was looking for a particular book. I couldn’t find it in any library in my State. Went interlibrary loans and found one copy at the library of Congress. Only one copy in the whole country. One of the best stories I ever [heard] about this is one when one of my professors was working on a trash pile of papyrus sheets and came across one that said [it] was the works of Meander. He went through that pile of papyrus with a fine tooth comb. He didn’t find anything but that single piece. He said that it felt as though he was looking across the centuries and saying, “Somewhere out there are the works of Meander.” [Friends,] this is how things get lost forever.

David Reed, 1997

Today, there are thousands of important books that will likely never share that fate as long as civilization lasts, because they were digitized and sent all over the world.  Many of these books were first put online by Project Gutenberg.  And many of the Project Gutenberg texts are online thanks to the work of David Reed.

I scanned and released Gibbon’s Decline and Fall of the Roman Empire and hardly a day goes by when I don’t get an email from someone thanking me for releasing it on the web. At one site I know that it has been downloaded 1800+ times in all six volumes.

David Reed, 2001

In the mid-1990s, Project Gutenberg had an outlandish-sounding goal: to make 10,000 books freely available online by the start of the 21st century.  They’d only managed to put a couple hundred online by then.  Authors like Clifford Stoll were skeptical that they, or anyone else, would ever reach such a goal.

But Gutenberg was soon publishing more and more texts every month, at an ever-increasing pace.   Lots of those texts had David Reed’s name on them.  Working persistently with his own scanner, well before the era of well-funded mass digitization, he digitized and proofread long works that few other people at the time would have taken on: Gibbon’s Decline and Fall; Shakespeare’s First Folio;  Josephus’ Antiquities of the Jews; Frazer’s Golden Bough; Tocqueville’s Democracy in America.  He also scanned numerous works weighty and light from authors like Rudyard Kipling, Louisa May Alcott, Robert Frost, James Joyce, and the US government.

Some critics in academia complained that the books David and others put up for Gutenberg were not up to the standards of scholarly editions.  David didn’t begrudge the work of scholars, but he wanted to put up more works, more quickly, to reach a broader audience.  As he put it in 1999:

[I] think that [it's] important to remember that we do all this work because we like to read and we like to share our discoveries with others…. I see no reason why the text specialists can’t have the specialist collections and the general people (like myself) have the general collections. There is room enough on the web for all of us. The real enemy are those who want to lock up all the books in the world. The real enemy are those who don’t read a single book.

David was fighting another enemy besides illiteracy, one closer to home. He had diabetes, and in the last few years of his life his health slowly worsened from complications of that disease. He didn’t mention it in this post (nor, as far as I can remember, in any of the posts he made to the Book People mailing list, from which these quotations are taken). But even while his health was failing, he continued to put books online, like this emergency childbirth manual that was posted this past October.  He was working to fulfill a dream that he described back in his 1999 post:

I dream of the day when we have 50,000 and 100,000 etext libraries on the web. Where there are 100 new etexts being released a week or every couple of days. When I can’t keep up with reading every etext that pops up on the Online Book Page or that Project Gutenberg releases. . I appreciate all the work that you are all doing. I love reading the work that you are all doing.

David died on April 21, 2009, according to the email his son Chris sent to David’s contacts list.  By then, Google Books and the Internet Archive’s book collection had made over 1 million books freely available online, the various Gutenberg projects had posted just over 30,000 books, and many smaller projects had posted numerous unique titles as well.  He lived long enough to see his dream come true, thanks in part to his own pioneering work and dedication.

I have dedicated etexts in honor of my daughter, my sons, my wife, parents and in honor of my companies I work for, even in honor of myself.

David Reed, 2001

Out there all over the Net, in millions of replicas, are the works of David Reed, transcribing many of the great authors that have also passed on.  In some sense, all of those works are dedicated  to him.  Through them, I hope his name lives on for generations to come.

April 9, 2009

Recent copyright news and comment (an extended mix)

Filed under: copyright,libraries — John Mark Ockerbloom @ 1:53 pm

I seem to have a certain degree of inertia over getting a blog post out, and there have been at least 4 interesting recent items related to copyright.  Since I haven’t managed to post about each individually, I’ll get over the hump by putting them all into a single post.  I hope most of my readers will find at least one of these items of interest.

1. This year marks the 100th anniversary of the passage of the Copyright Act of 1909, the first “modern” copyright law of the US.  The announcement for an April 30 conference on the Act describes its significance:

The 1909 Act was the first to protect works upon publication with notice, without prior registration; the first to expressly recognize a right to prepare derivative works; and the first to expressly recognize the public domain. The 1909 Act remained in effect for seven decades, during which time copyright law was repeatedly called upon to deal with the disruptive effect of new technologies, such as motion pictures, sound recordings, radio and television, photocopy machines, and computers. As a result, the 1909 Act had a significant influence on the copyright law we have today.

Several aspects of the law are ones I wish we still had today, like terms of more reasonable length (the maximum under the 1909 act was 56 years from publication), and the earlier expiration of copyrights into the public domain if the owner did not care enough about it to take some basic steps to maintain it (namely, including a copyright notice, and eventually registering and renewing it).  Unfortunately, as William Patry laments, the treaty structures we’re now embroiled in prevent us from returning to that regime.

If you’d like to see both the original 1909 act, and the evolution of copyright law since, David Hayes has a wonderful site where you can read the law that was in effect at various times from then until now.
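
As a rough illustration of how the 1909 regime's term limits worked, here is a minimal Python sketch.  It assumes the two 28-year terms that made up the 56-year maximum mentioned above, counts in whole years, and ignores later extensions and other complications; the function and parameter names are my own, purely for illustration, and none of this is legal advice.

```python
INITIAL_TERM_YEARS = 28  # first term, running from publication with notice
RENEWAL_TERM_YEARS = 28  # second term, available only if the copyright was renewed


def expiry_year_1909(publication_year, had_notice=True, renewed=False):
    """Estimate the year copyright would end under the 1909 Act as enacted.

    Simplified: whole years only, no later term extensions.  A work published
    without a copyright notice is treated as entering the public domain in
    its publication year.
    """
    if not had_notice:
        return publication_year
    term = INITIAL_TERM_YEARS
    if renewed:
        term += RENEWAL_TERM_YEARS
    return publication_year + term


# A 1910 book that was never renewed, versus one that was.
print(expiry_year_1909(1910))
print(expiry_year_1909(1910, renewed=True))
```

The point of the sketch is the default: without a notice and a renewal, a work reached the public domain much sooner than the 56-year maximum.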

2. I’ve seen more discussion online of the Google Books settlement, and the monopoly rights it gives to Google for providing digitized copies of “unclaimed” out of print copyrighted works (or what James Grimmelmann calls “zombie works”).  It’s worth a reminder that it isn’t just Google that has a potential monopoly here; it’s also the Book Rights Registry itself.  Even if other digitizers get the rights to do what Google does, and set their own retail prices for access, the Book Rights Registry can decide on the wholesale prices and other terms, and these will obviously play a big role in determining retail prices that any provider will offer.

If the settlement agreement is upheld (as I hope it will be; I’d much rather see 1 comprehensive collection of digitized out of print books than 0), it could form the model for a future compulsory licensing scheme for such books.  Congress has enacted these before, when it’s seen a sufficient need and interest, and it’s set maximum prices for such licenses.  For instance, the current maximum license fee for recording many songs is 9.1 cents per copy, reflecting a steady but controlled rise over the last few decades.  According to this account of recent negotiations, some publishers reportedly wanted a dramatic boost to 15 cents per copy, while some digital music retailers wanted the maximum cut in half, to 4.5 cents per copy.  Congress’ Copyright Royalty Board has managed to find a middle ground that balances the interests of creators and users, while providing a way to avoid the inefficiencies of song-by-song rights negotiation.  Congress could do something similar with copyrighted but out of print books, if their constituents urged them to.  (They would have to be careful to stay within international treaty constraints, but if the licensing regime for Google falls within those constraints, then I would think that a similar regime for all set up by Congress should as well.)

3. It might eventually be possible to put many of these books online in any case, if a recent decision by a federal court in Colorado is upheld and is applied broadly enough.  In 1996, many foreign works were taken out of the public domain and put back into copyright by a law passed as a result of the GATT treaties.  (I have a discussion on copyright renewals that goes into some of the details.)  Last week, a federal judge struck down this law; in the words of plaintiff attorney Larry Lessig, it “violated the First Amendment to the extent it restored copyright against parties who had relied on works in the public domain.” (Such parties are known as “reliance parties”.)

I’m happy to hear about the decision, but I’m not yet ready to fire up the scanners.  For one thing, another federal circuit has already ruled the opposite way on the same issue, as William Patry noted in a 2005 blog post about a case involving Luck’s Music Library.  It remains to be seen how higher courts, or courts in other jurisdictions, will deal with these contradictory rulings.  Also, despite some claims I’ve seen online, the decision doesn’t state outright that removing works from the public domain is necessarily unconstitutional.  Rather, it says that doing so requires a higher standard of constitutional scrutiny that is not met for the case in question, in particular because Congress restricted the rights of reliance parties more stringently than international treaties actually required.

It seems possible to me that, even if the new decision survives appeal, it might simply result in an expansion of reliance party rights, and not a general right to put books online whose copyrights had been restored.  (On the other hand, the definition of “reliance party” in the law in question seems to me to include libraries and others that have simply “acquire[d].. a copy” of a restored work.) I’m not a lawyer, though, and the copyright restoration laws at issue are notoriously complicated.  I’d be interested in hearing more commentary from lawyers about the details and implications of this decision.

4. Finally, the recent publication in the New York Review of Books of a leaked, damning ICRC report on torture at Guantanamo raises some interesting copyright and ethics questions.  As David Bigwood suspects, the report is copyrighted (effective the moment it was written down) and was published without the permission of the Red Cross, which has a general policy of opposing publication of its confidential reports.  Mind you, I don’t think libraries that simply receive a print copy unsolicited (e.g. as part of an ongoing subscription to the NYRB) should have any legal or moral qualms about keeping, preserving, and giving their patrons access to it.  But what about electronic versions, which typically involve new copies made every time someone new reads them?

There are a few approaches one could take to this question.  A fair use defense is certainly worth consideration, for instance.  The document clearly reveals many things of great public interest in the US; it’s being published online for noncommercial purposes (they’re distributing it as a free, ad-less PDF); and there’s no market for the work to be affected, since the Red Cross does not market these reports, or put them out in the public at all.  On the other hand, the document is not just quoted from, but reproduced in its entirety, and traditionally there’s been less fair use slack given for unpublished works than for published works.  But there have been past cases (such as those involving Diebold voting machine memos) where reproducing documents in full in the public interest has been upheld as fair use.  I suspect that the Red Cross will most likely not take the trouble to sue over this recent publication, but if they did, I can’t be positive about how the case would turn out.

One might argue that whether or not fair use applies, publication is justified as civil disobedience of copyright law in the service of a higher law against torture.  This approach poses some problems of its own, though, particularly under theories where those who engage in civil disobedience gladly accept the legal consequences of their actions.  Congress as of late has been steadily increasing the penalties for copyright infringement, and even the statutory and attorney’s fees, independent of any damages, are now large enough to give many people pause.

There’s also another interesting way to resolve the copyright question: A member of Congress could read the report into the Congressional Record.  By law and custom, the statements of the legislature are given immunity from most forms of legal liability, so a copy of the report in the CR, including in the online version, should not be a legal violation of copyright, as far as I’m aware.  (Indeed, The Online Books Page already links to one other book that was read in its entirety into the CR.)  The online version there would then be readable to anyone with an Internet connection.

Reading the report into the record wouldn’t just clear up a copyright issue.  It would also put all of Congress officially on notice about the violations of American and international law by the government.  And just as we have obligations under copyright treaties to deal with copyrighted works in various ways, we also have obligations under human rights treaties to outlaw prisoner mistreatment, and investigate and prosecute those who conducted, oversaw, and covered up torture and other human rights violations, no matter how high their rank or office.

In other words, we Americans now have a test before us:  Do we take the essential rights of life and integrity of living, breathing human beings at least as seriously as we take the rights of intellectual property?  If you think we should, you might want to urge your representatives in Congress and other governmental officials to take appropriate action.

March 30, 2009

How to find complete multi-volume works in Google Books

Filed under: online books — John Mark Ockerbloom @ 10:15 pm

While Google’s agreement on copyrighted books has been the subject of much discussion lately, they’ve also been continuing to add public domain titles at a brisk pace.  For instance, they announced in February that they now had 1.5 million public domain volumes formatted for mobile devices.  And last week, they noted that they had completed their scans of hundreds of thousands of volumes of 19th century public domain books from Oxford’s Bodleian library.

If you look at the three example book links in their Oxford post, you’ll notice that each of them goes to a volume of a multi-volume edition.   Works from the nineteenth century and before were often originally published in multiple volumes, such as the “three-decker” format common for Victorian novels.  When such books are reprinted today, they’re usually printed as a single volume, but to read all of many Google titles, you’ll have to range over multiple volumes.

Unfortunately, as various readers have noted, it can be quite difficult to find readable copies of all of the volumes in a multi-volume edition.  For various reasons, they often don’t all come up when you do a search for a particular title.  This can make readers think there are no complete digital editions of a work they’re seeking, even when there are.

In working with people who have helped me fill requests for public domain books, I’ve compiled a series of techniques for finding complete multi-volume sets in Google Books.  I’d be happy to hear additional tips from readers.

  • First, do a search for full-view volumes of the work you’re looking for.  One good way to do this is to go to Google’s advanced book search page, select the “full view only” option, and enter author and title words in the appropriate blanks.
  • If you get a hit, check the start and the end of the scan, to verify which volumes are actually present. Sometimes you’ll find more than one volume in the scan, either because multiple volumes were bound together, or because Google combined volumes in its scan.
  • Go to the “about this book” page for the scan, and look in the lower regions to see if there is an “Other editions” section. This often includes links to other volumes, not just other editions. If there’s a “See more” at the bottom of such a section, click on it to see more volumes or editions.  (Sometimes Google will have multiple editions as well as multiple volumes for the same work.  It’s best when possible to compile volumes from the same edition.  You can do this by matching publishers and dates between volumes, though keep in mind that some multivolume editions came out over the course of multiple years.  Editions from different publishers, or from different times, may have inconsistent content, and might not divide into volumes at the same points.)
  • If the book is from the University of Michigan (as reported either in the “about this book” page or in the scanned front pages) check the Mirlyn catalog for the book. Sometimes this will turn up volumes scanned by Google that have been put in the Hathi Trust repository, or in Google Book Search itself, but that for some reason don’t show up in an ordinary Google books search. Some other Hathi Trust libraries also have links to digitizations of their content; see this page for details.
  • If this didn’t turn up all the volumes you’re looking for, repeat the process above for the other volumes in your initial hit list. Sometimes those will have “Other editions” links to additional volumes that didn’t appear with the earlier hits.
  • If you manage to complete a set this way, consider sharing your success with other readers.  If you fill in my book suggestion form with the volumes you find,  I can list a neatly consolidated edition of all the volumes on The Online Books Page, and help other people avoid going through all the trouble you just did.  (Give the book’s title, URL for the first volume, and other information in the appropriate blanks, and then add URLs for subsequent volumes in the “Anything else we should know?” section of the form.)
  • Even if you only partially succeeded, if it’s a work you’re particularly interested in you can use my suggestion form to let me know what you’ve been able to find.  If I can’t easily find the other volumes myself, I can at least list what was found on my works-in-progress page. With luck, someone coming along later will find or digitize the remaining volumes, and I can list the set.
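
The bookkeeping part of these steps (matching volumes by publisher and date, then checking which volumes are still missing) could be sketched in Python along the following lines.  This is only an illustration under my own assumptions: the records are typed in by hand from what you find, the field names and the five-year window for multi-year editions are hypothetical choices, and the example URLs are placeholders.  It does not query Google Books or any other service.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Scan:
    """One full-view scan, recorded by hand (all fields are illustrative)."""
    url: str
    publisher: str
    year: int
    volumes: tuple  # volume numbers present; one scan may hold several


def group_by_publisher(scans):
    """Group scans under a normalized (trimmed, lowercased) publisher name."""
    groups = defaultdict(list)
    for scan in scans:
        groups[scan.publisher.strip().lower()].append(scan)
    return dict(groups)


def missing_volumes(scans, total_volumes):
    """Return the sorted volume numbers that none of the given scans covers."""
    found = {v for scan in scans for v in scan.volumes}
    return sorted(set(range(1, total_volumes + 1)) - found)


def best_edition(scans, total_volumes, max_year_span=5):
    """Pick the publisher whose scans leave the fewest volumes missing.

    Groups whose dates spread over more than max_year_span years are skipped,
    on the assumption that they mix different editions.
    Returns (publisher, missing volume numbers), or None if nothing qualifies.
    """
    groups = group_by_publisher(scans)
    candidates = []
    for publisher, group in groups.items():
        years = [scan.year for scan in group]
        if max(years) - min(years) > max_year_span:
            continue
        missing = missing_volumes(group, total_volumes)
        candidates.append((len(missing), publisher))
    if not candidates:
        return None
    _, publisher = min(candidates)
    return publisher, missing_volumes(groups[publisher], total_volumes)


# Placeholder records for a hypothetical three-volume work.
found_scans = [
    Scan("https://example.org/vol1", "Harper", 1845, (1,)),
    Scan("https://example.org/vol2-3", "Harper ", 1846, (2, 3)),
    Scan("https://example.org/other", "Murray", 1838, (1,)),
]
print(best_edition(found_scans, total_volumes=3))
```

An empty list of missing volumes means the set is complete for that publisher; anything else tells you which volumes to keep hunting for with the steps above.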

Similar techniques can be used for compiling runs of historic serials, which are also present in Google, and can be of great interest to readers.

If you find these suggestions useful, I hope you’ll help me compile sets of your favorite public domain works, so we can take advantage of all this wonderful old material that Google and others are digitizing.

March 24, 2009

Gloriana St. Clair: A brief appreciation

Filed under: awards,libraries,people — John Mark Ockerbloom @ 10:50 pm

The organizer of today’s Ada Lovelace Day, a day to celebrate women in technology, says that women need female role models they can emulate.  I’d add that men can use female role models as well.  There are at least two obvious reasons. First of all, we get a wider range of inspiration when our role models aren’t limited to half the population.  But also, people can be rather clueless about groups of people that they don’t normally see much, and that cluelessness can hold people back needlessly.

I don’t recall being consciously sexist when I entered grad school in computer science, but I wasn’t the most clueful person either.  When I noticed that there were only 4 women in our entering class of 36 (a ratio unfortunately not too far off the one I saw in undergraduate computer science), one of the first things I blurted out to one of those women was something like “gee, there’s going to be a lot of romantic competition for you four,” thinking about them more as potential dates than as colleagues in computer science.

I was fortunate, however, to have multiple female role models to learn from in my time as a graduate student.  Mary Shaw inspired me and many others to gain mastery over all kinds of challenges, from software engineering to bicycle trekking, through systematic and rigorous information gathering and analysis.  Jeannette Wing contributed some of the key technical  foundations to my own dissertation work (in her work with Barbara Liskov on type substitutability), and as a member of my dissertation committee repeatedly challenged me to write more clearly and logically, helping ensure that my ideas were sound and understandable.  And I don’t have the space here to enumerate, or express full thanks for, what I’ve learned from Mary Mark since I met her.

I also found another role model who I’d like to talk about today: Gloriana St. Clair, dean of libraries at Carnegie Mellon.  Unlike the other women I’ve mentioned above, she has no “technology” degree, but she’s played very important roles in bringing together technology and librarianship, to the benefit of both.

She has long emphasized the importance of digital technology to the future of libraries, featuring it prominently in strategic plans and library organization, and cultivating people with the skills and knowledge to design and improve the digital library.  (And not just her own staff; she encouraged me, while I was still in the computer science department, to get out to library conferences to find out more about what people were doing and thinking, and even gave me a ride to a CNI forum in Washington, DC.)

She’s also helped educate technologists about the important roles that libraries and librarianship play in managing information.  While at Carnegie Mellon, I got involved in a computer science-led project to build a massive digital book collection, where much of the early thinking seemed to assume that the problem was largely a matter of committing enough technology and funding.  I was very happy to see Gloriana get involved and show how sound librarianship could make that project, as well as other digital library initiatives I’d dabbled in previously, much more effective, usable, and preservable than a purely engineering-oriented project would have been.

She’s also been unafraid to take a leap into a new area or initiative when called for.  Not content to settle for an MLS degree as a librarian, she went on to get a PhD in literature, and an MBA that she’s used both to help manage libraries and teach others about library management.  And when a commercial publisher bought the library science journal she edited and raised its prices, she organized a mass exodus of editors to a new, lower-cost journal founded under the auspices of SPARC.

My own career jump, from a computer science department at Carnegie Mellon in Pittsburgh to a library at the University of Pennsylvania in Philadelphia, was made easier in a number of ways by her help and example.  Indeed, after I got more acclimated to library culture, I  gained a better appreciation of how well she accomplished one of the great functions of librarians: to build bridges and spread knowledge among a variety of different disciplines.  I also grew to appreciate the importance of building such bridges for librarianship itself.  Librarians can be another set of folks that many faculty and professionals don’t see much of, and that they can be correspondingly clueless about.  If libraries and their users are not to be held back needlessly, we need to build better bridges between each other.

I’m not alone in my appreciation for Gloriana.  Just a few weeks ago, the Association of College and Research Libraries named her Academic/Research Librarian of the Year.  To their award, I’d like to add my own personal and professional thanks.  And thanks as well to the many other women in technology who have, knowingly or not, given me knowledge, inspiration, encouragement, and some helpful clues.  I hope I can make a suitable contribution in turn.

March 18, 2009
