Why should reuse be hard?

By far the most widely cited paper with my name on it is a 1995 paper on architectural mismatch.  The journal version of the paper was subtitled “Why reuse is so hard”.  It was a paper about failure rather than success, even though success is what most researchers prefer to write about when they’re talking about their own work.  We discussed the problems we’d encountered trying to build a new software system from existing parts, analyzed some of the reasons for the failures, and suggested how systems could be improved in the future to make reuse easier.

The paper was unexpectedly well received, and was recently named as one of the most influential papers to appear in IEEE Software.  (I can’t claim too much credit for this myself; my adviser David Garlan and my fellow grad student Robert Allen rightly appear ahead of me in the author credits.)  ISI Web of Knowledge, which tracks the journal version of the paper, reports it’s been cited over 100 times in other journal articles; Google Scholar, which tracks both the journal version and the conference version that was published earlier the same year, reports hundreds more citations.

Google Scholar also reports an unexpected statistic: even though the journal version of a computer science paper is generally considered more authoritative than the earlier conference version (and rightly so, in our case), the conference paper has been cited even more often than the journal version.  Why is this?  I can’t say for sure, but there’s one important difference between the two versions: the conference paper has been freely accessible on the web for years, and the journal paper hasn’t.  It’s in a highly visible journal, mind you: pretty much anywhere with a CS department subscribes to IEEE Software, and many individual computer practitioners subscribe as well.  So I suspect that most of the authors who cited our paper could have cited the journal paper (especially since it came out only a few months after the conference paper did).  But the conference paper was that much more easily accessible, and it was the one that got the wider reuse.

We’ve recently published a followup to our paper, appearing in the July/August issue of IEEE Software.  As we note in the followup, the problem of architectural mismatch has not gone away, but several developments have made it easier to avoid.  One of them is the great proliferation of open source software that has occurred since the mid-1990s, which provides a wide selection of software components to choose from in many areas, and “a body of experience and examples that clarify which architectural assumptions and application domains go with a particular collection of software” (to quote from our paper).

Just as the growth of open source has made software easier to reuse, the growth of open access to research can make ideas and research results easier to reuse.  We saw that with our initial paper, I think, and I hope we’ll see it again with the followup. I’ve made it available as open access, with IEEE’s blessing.  Interested folks can check it out here.

Author: John Mark Ockerbloom

I'm a digital library architect and planner at the University of Pennsylvania.