Re: Central versus institutional self-archiving

From: Stevan Harnad
Date: Thu, 21 Sep 2006 16:31:24 +0100

Pertinent Prior AmSci Topic Threads:

    "Central vs. Distributed Archives" (Jun 1999)

    "PubMed and self-archiving" (Aug 2003)

    "Central versus institutional self-archiving" (Nov 2003)

Let me try to explain why unreflective support for PubMed Central (PMC,
and UK PMC) *as the locus for direct self-archiving by authors* is very
unfortunate for Institutional Repositories (IRs), for self-archiving,
and for Open Access (OA) progress in general. The reason is very simple,
and I very much hope that it will be given some thought by the many who
are currently pushing unquestioningly for central self-archiving. (Please
note that this has nothing to do with the existence and value of PMC:
only with whether or not it should be authors' primary locus of deposit
when self-archiving their papers, or for institutions and funders, when
mandating that authors self-archive their papers.)

(1) PMC and UK PMC are grounded in two things: (i) the pre-OAI
and pre-IR central-archiving model originating from the early and very
successful Physics Arxiv and (ii) Harold Varmus's -- and hence NIH's,
PLoS's, the Wellcome Trust's and now the UK MRC's -- fixation on the
central (indeed the PMC) model of OA self-archiving. That self-archiving
model is already obsolete in the OAI era of interoperable OAI-compliant
IRs.

(2) Although they appear to be complementary -- after all, OAI
renders all OAI-compliant archives, whether central or institutional,
interoperable, and hence equivalent -- in reality, at this critical point
in the evolution of OA self-archiving policy-making, (a) institutional
self-archiving and (b) central self-archiving are profoundly at odds with
one another in the quest for a systematic, universal self-archiving policy
solution that will systematically scale up to cover all research output,
from all institutions, in all disciplines, worldwide.

(3) In the OAI-interoperable age, the natural and optimal solution is for
researchers to self-archive their own papers in their own OAI-compliant
Institutional Repositories (IRs) and for whatever central archives one
may wish to have -- whether subject-based or funder-based or national --
to be *harvested*, via the OAI protocol for metadata harvesting, from
the distributed local IRs, rather than deposited (or re-deposited)
directly. That is what the OAI metadata-harvesting protocol was created
for.
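
To make the harvesting model in (3) concrete, here is a minimal sketch
of what a central archive's harvester might do: build an OAI-PMH
ListRecords request against an IR and read the Dublin Core metadata out
of the response. The IR address, the set name "biomed" and the sample
record are hypothetical, invented purely for illustration; only the
protocol elements (the ListRecords verb, the oai_dc metadata prefix and
the XML namespaces) come from OAI-PMH itself.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces defined by OAI-PMH 2.0 and Dublin Core.
OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request for unqualified Dublin Core."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if set_spec:
        params["set"] = set_spec      # selective harvesting of one set
    if from_date:
        params["from"] = from_date    # incremental harvesting since a date
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return (OAI identifier, title, links) for each record in a response.

    Simplification: resumption tokens (paging) are not handled here.
    """
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(OAI + "record"):
        ident = rec.find(OAI + "header").findtext(OAI + "identifier")
        title = next((e.text for e in rec.iter(DC + "title")), None)
        links = [e.text for e in rec.iter(DC + "identifier")]
        records.append((ident, title, links))
    return records


# Hypothetical response from a hypothetical IR, kept inline so that the
# sketch runs without network access.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:ir.example.edu:123</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example paper</dc:title>
          <dc:identifier>http://ir.example.edu/123/</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://ir.example.edu/oai", set_spec="biomed"))
    print(parse_records(SAMPLE_RESPONSE))
```

The point of the sketch is only that the author deposits once, locally,
and any number of central archives can then collect the same record by
issuing requests of this kind.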

(4) So although on the surface it looks as if there is room for
complementarity, pluralism, and parallelism between (let us call them)
CRs (central repositories) and IRs (institutional repositories), the
question of what their optimal interrelationship should be is far
more complicated insofar as formulating a systematic, effective OA
self-archiving policy is concerned, to ensure that it will scale up
to cover all of OA space. There is a profound and important strategic
conflict specifically related to institutional and research-funder
self-archiving policy (mandates).

(5) Dr. Alma Swan has published key papers on both the subject of
OA self-archiving policy and the subject of institutional versus central
self-archiving (IRs vs. CRs).

(6) The gist of the strategic and practical conflict between IRs and CRs,
as well as the basis for its resolution, is the following:

(7) Universities (and other research institutions) are the *primary
research providers*. It is their researchers who conduct and publish
the research. It is they and their researchers who are in a position
to provide OA. It is they and their researchers who co-benefit from
providing OA by self-archiving their own research output. The natural
place for them to self-archive their own research output is in their
own respective (OAI-compliant) IRs. This covers all the output of all
their disciplines (some research institutions have just one research
speciality, whereas others, including all universities, cover most or
all research specialties).

(8) Universities (and other research institutions) are real entities,
with their own institutional identity, and it is their own
institutional visibility and productivity and research impact (along
with the impact and progress of research in general) that they are
motivated and indeed obliged to promote and foster. CRs do not
correspond to institutional entities with needs of their own. (The
partial exception is when a CR is funder-based, where the funder is
an entity with interests. I will return to this.)

(9) Universities (and other research institutions) are also the ones
that are in the strongest position to mandate the self-archiving of their
own research output, as well as to monitor and to reward compliance with
their self-archiving policy. (Again, the only exception is a funder,
or a national government.)

(10) Universities (and other research institutions) are helped in their
efforts to mandate OA self-archiving by OA self-archiving mandates from
the funders of their research, but (a) not all their research is funded,
and (b) it would be extremely awkward and inefficient to have a different
external cross-institution CR as the locus of primary deposit for
every funder and every subject and every other possible collection or
combination of subjects (and nations!) by a single institution's authors.

(11) The natural and efficient way to create CRs -- whether funder CRs
or subject-based CRs or multidisciplinary CRs or national CRs -- is to
selectively harvest their contents from the individual, distributed
IRs of the researchers' own institutions.

(12) IRs are also the most natural and efficient and systematic and
universal way to scale up to cover all of OA space -- originating
from all disciplines, at all institutions, in all nations.

(13) A few generic OAI-compliant CRs are fine for provisionally or even
permanently depositing research by researchers whose institutions do not
yet have an IR (or by researchers who do not have an institution!); but
apart from that, direct deposit in CRs is extremely counterproductive at
a time when self-archiving has not yet been established as a systematic
research imperative.

(14) The optimal thing for both research institutions *and* funders
to do now is to mandate self-archiving in the researcher's own IR
(except where a default generic CR is needed because the researcher's
institution does not yet have an IR).

(15) Compliance can be monitored and rewarded, primarily by the
researcher's own institution, but also through the grant-fulfilment
conditions of the funder.

(16) This will systematically scale up to cover all disciplines, at
all institutions, globally.

(17) Mandating central self-archiving (e.g., in PMC) instead
simply creates an unsystematic and incoherent policy that
does not translate into a general means of covering all
research output of all research institutions.

(18) The NIH, Wellcome Trust and MRC self-archiving policies (though
they make important contributions to OA) are hence complicating and
retarding progress toward a universal, systematic solution for making all
institutions' research output OA because of their insistence on direct
deposit in PMC.

(19) What the NIH, Wellcome Trust and MRC should be mandating is not
the arbitrary direct depositing in PMC, but universal depositing in the
fundee's own IR, from which PMC (and any other CRs) can then harvest
collections, if they wish.

(20) In this way, institutional and funder self-archiving mandates
can be synergistic instead of antagonistic (confusing researchers
about where to self-archive, arousing resentment about the need to do
multiple deposits; failing to generalize and scale up to a systematic,
universal self-archiving policy and solution, for all institutions,
disciplines, funders and nations, and in general retarding instead of
accelerating progress in the formulation of effective and compatible
self-archiving policies globally).

(21) The last point is that not only is primary depositing in CRs a
very bad idea, but in the OAI age, CRs need not contain the full texts
at all: they are really just "virtual archives" in much the way that
Google or OAIster is: they harvest the metadata and links, allow
focussed search, and then point back to the IRs for accessing the
full-text itself. The notion of having to have one central "place" to
put all papers is obsolete in the OAI age. (I am not referring to
redundancy and preservation issues, for which some duplication is
useful and indeed necessary; I am referring to the fallacious notion
that we need CRs in order to have the target content for searching
and accessing "all in one place." We do not; and we should not.)
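
As a rough illustration of the "virtual archive" in (21), the following
minimal sketch models a CR that holds only harvested metadata and links
and sends every search hit back to the originating IR. The class names,
field names and example records are all hypothetical; this is not a
description of how PMC, OAIster or any existing service is actually
implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HarvestedRecord:
    """Metadata harvested from an IR; the full text stays at ir_url."""
    title: str
    subjects: tuple
    ir_url: str


class VirtualArchive:
    """A CR that stores harvested metadata and links, never full texts."""

    def __init__(self):
        self._records = []

    def harvest(self, records):
        """Add records collected from one IR to the central index."""
        self._records.extend(records)

    def search(self, term):
        """Return the IR links of records whose title or subjects match."""
        term = term.lower()
        return [
            r.ir_url
            for r in self._records
            if term in r.title.lower()
            or any(term in s.lower() for s in r.subjects)
        ]


if __name__ == "__main__":
    archive = VirtualArchive()
    # Two hypothetical IRs contribute one record each.
    archive.harvest([HarvestedRecord(
        "Malaria vaccine trial", ("medicine",), "http://ir.example.edu/1/")])
    archive.harvest([HarvestedRecord(
        "Quark confinement", ("physics",), "http://eprints.example.ac.uk/7/")])
    print(archive.search("malaria"))
```

Searching is "all in one place", but the content is not: each result is
a pointer to the paper in its author's own IR.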

Many well-meaning advocates of OA do not yet understand any of this,
imagining that CRs like PMC will in some mysterious way manage to
cover all of OA space. I hope the summary above will help to redirect
the welcome and important contributions of the supporters of the
NIH-PLoS-Wellcome-MRC OA initiatives in a direction that is more helpful
for scaling up to cover the world's research output as a whole.

Stevan Harnad
Received on Thu Sep 21 2006 - 17:04:48 BST
