Re: Rethinking "Collections" and Selection in the PostGutenberg Age

From: Stevan Harnad <>
Date: Mon, 12 Apr 2004 11:49:39 +0100 (BST)

> From: David Goodman
> If the use of self-archived papers is ultimately
> significant, then how will librarians account for such use
> when using usage statistics to judge value of a publication?

It is important to distinguish (1) the research community's problem of
access/impact loss, which is solved by providing Open Access (OA) to
each and every published journal article (either by publishing it in an
OA journal or by publishing it in a toll-access (TA) journal and
self-archiving a *supplementary* OA version) from (2) the library
community's (and hence, secondarily, also the research community's)
problem of deciding which serials to acquire, based on citation and
usage statistics.

The merits of self-archiving should in no way be weighed by their bearing
on the library's current methods of deciding whether a journal is used
often enough by an institution's users to warrant renewal. There *will*
be methods of measuring or estimating online usage, even for distributed,
cached and mirrored content, but what is of paramount importance and
urgency right now is to put an end to the impact/access loss by providing
OA for all journal articles. It would be extremely counterproductive
to constrain this OA provision in any way with concerns about usage
statistics for journal acquisition. Nor does it even make sense, for
the following reasons:

    (i) For OA journals that recover costs via author/institution
    publication fees, there is no library journal-acquisition issue,
    as the journals are not acquired. Citation and usage statistics
    are hence mainly for the authors and their evaluators and funders,
    in deciding which OA journal to publish in.

    (ii) For OA journals that still recover costs on the TA model, but
    make their online editions OA, there are ways to estimate usage
    backwards from citations, using the usage/citation correlation data
    (Brody et al. 2004; Kurtz et al. 2003; Kurtz 2004; Lawrence 2001).

    (iii) For TA journals whose authors self-archive a supplementary
    OA version, it is not clear how the usage statistics for these
    self-archived versions would be used in deciding whether or not to
    subscribe to the TA version; but in any case (a) the percentage of
    any TA journal's contents that is also available OA through
    self-archiving grows anarchically and is today still small (hence
    increasing it is a far greater priority than tracking its usage for
    journal-acquisitions purposes) and (b) as it grows, and as estimates
    of the percentage become available, the estimation methods of (ii)
    can also be used to estimate usage.
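
To make the backward estimation in (ii) concrete, here is a minimal
sketch, assuming the usage/citation relationship can be approximated by
a straight line fitted to titles for which both counts are known. The
function names and all the numbers are hypothetical illustrations; they
are not data or methods taken from the studies cited above.

```python
# Hypothetical sketch: estimate usage (downloads) from citation counts
# with an ordinary least-squares line. All figures are invented.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_usage(citations, slope, intercept):
    """Estimate usage for a title whose citation count is known."""
    return slope * citations + intercept


# Calibration set (invented): titles with both citations and downloads known.
calib_citations = [10, 20, 30, 40]
calib_downloads = [120, 220, 320, 420]

slope, intercept = fit_line(calib_citations, calib_downloads)

# Title with a known citation count but no usage log (invented value).
estimated = estimate_usage(25, slope, intercept)
print(f"estimated downloads: {estimated:.0f}")
```

Any real estimate would of course depend on how strong the correlation
actually is for the field and period in question.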

I think this entire line of inquiry, however, is an example of how the
important new goal of OA provision should *not* be laid in the Procrustean
bed of serials-acquisitions concerns.

> Speaking personally, it would seem to me that with the cooperation of the
> major search engines, it would be possible to measure accesses through
> them. This would require a high degree of standardization in the metadata,
> which is not presently the case. Measuring direct internet access to known
> authors would be much harder.

Many things are possible in usage analysis, once the content is up there.
But the content is not yet up there. So let's worry about getting our eggs
laid before we worry about how we will count the chicks. Needless worries
only inhibit egg-laying!

> Again personally, I think that author self-archiving is a very good thing,
> but not as a systematic method of distribution. It may be the best we can
> do for now, in view of the continuing self-interest of many or most
> journals. But it is subject to all the risks of site stability--not to
> mention author stability.

Self-archiving is not, and does not aspire to be, "a systematic method
of distribution" at this time. It is a systematic method of providing
supplementary *access* to TA journal articles (whose systematic method of
distribution remains TA journal publication) for the sake of those would-be
users webwide whose institutions cannot afford the access-tolls, and whose
potential usage and impact would therefore otherwise be lost.

It does not help to keep mixing apples and oranges. The library community
must come to grips with the distinction between access-provision and
serials-acquisition. They are diverging, and this fact has to be understood
and taken into account.

> University based archives will be a step
> forward, with the individual linking the personal site to it. There are of
> course many experiments in process for accessing these sites in a
> systematic way; any such would facilitate measurement.

Assessing access-provision sites and OA usage is important, useful,
and evolving. But it has almost nothing to do with serials-selection.
That is *not* why it is important to measure OA usage.

> I myself think that even in a completely open access environment, if
> papers are organized or branded as part of journals, as seems to be the
> current direction of thinking, it will still be valuable to have
> measurements of the readership as a complement to measurements of the
> citations. It is true that one of the uses, in determining which titles a
> library should spend money in purchasing, will become obsolete. But
> libraries will still need to know what journals to include in catalogs or
> journal lists that are relevant to their patrons, as well as other aspects
> of collection development. Authors will still need to determine
> appropriate journals. Publishers will still want to measure their success
> in attracting readers.

I am afraid that this too is trying to fit the future into a rather
antiquated and increasingly dysfunctional Procrustean bed. Yes, OA usage
and impact will be measured and analysed, in terms of articles, authors,
institutions, journals, fields, topics, etc. But this will not have much
to do with libraries' "catalogs or journal lists that are relevant to
their patrons"!

As to "collections": They are a TA concept. They are still very much
with us as such. But whether they will perdure in the OA age, no one
can say. Let's get there first, and then we will find out.

        Brody, T., Stamerjohanns, H., Harnad, S., Gingras, Y. & Oppenheim,
        C. (2004) The effect of Open Access on Citation Impact. Presented
        at: National Policies on Open Access (OA) Provision for University
        Research Output: an International meeting, Southampton,
        19 February 2004.

        Kurtz, Michael J.; Eichhorn, Guenther; Accomazzi, Alberto; Grant,
        Carolyn S.; Demleitner, Markus; Murray, Stephen S.; Martimbeau,
        Nathalie; Elwell, Barbara. (2003) The NASA Astrophysics Data
        System: Sociology, Bibliometrics, and Impact. Journal of
        the American Society for Information Science and Technology

        Kurtz, M.J. (2004) Restrictive access policies cut readership
        of electronic research journal articles by a factor of two.
        Harvard-Smithsonian Centre for Astrophysics, Cambridge, MA.

        Lawrence, S. (2001) Online or Invisible? Nature 411 (6837): 521.

Stevan Harnad

NOTE: A complete archive of the ongoing discussion of providing open
access to the peer-reviewed research literature online (1998-2004)
is available at the American Scientist Open Access Forum.

Unified Dual Open-Access-Provision Policy:
    BOAI-2 ("gold"): Publish your article in a suitable open-access
            journal whenever one exists.
    BOAI-1 ("green"): Otherwise, publish your article in a suitable
            toll-access journal and also self-archive it.
Received on Mon Apr 12 2004 - 11:49:39 BST