Re: Central versus institutional self-archiving

From: David Goodman
Date: Sun, 3 Oct 2004 17:21:19 +0100


Doesn't it depend on the institution: in particular upon the
institution's reliability, its commitment to self-archiving and OA in
general, and its general orientation towards digital access and

I will certainly agree with you that every academic and research
institution should have such responsibility and commitment; I also agree
that if an institution does have it, then it is a satisfactory place for
self-archiving. However desirable, however urgent, this is not now
the case.

In my academic career, I have been formally associated with six
institutions. One of the six properly understands the issues, and
should always have the relatively small funds to support it. A second
certainly has money, but its commitment to archival preservation of any
sort of faculty record, including digital, has been remarkably
irresponsible; a third has the understanding, but the nature of its
funding over the last few decades has been so problematic that nobody
could or does trust it for anything of a permanent nature; a fourth has
repeatedly demonstrated lack of understanding as well as lack of
financial support, and the other two are variable in these respects.

I may have been unfortunate in my sample, and so I do not name the
institutions. The proper approach is to upgrade the institutions; I
suggest that an appropriate technique is for funding agencies to require
that a university has such a capacity, not just for post-print
self-archiving, but for all the other important uses of an institutional
repository. However strong your argument for action is, it would be
reliance upon speculation to count on it as a short term development.

For people at such institutions, such as myself, the policy of some
otherwise "green" publishers leaves only personal home pages. In the
unreliable institutions, their commitment to the maintenance and
accessibility of these pages is similarly irresponsible. I am not aware
of any that has a formal commitment to maintain such pages, or even
provide for referral of their IP addresses, when a faculty member
leaves; I am not aware of any that has a policy to maintain them when a
faculty member retires or dies. (If there are enlightened exceptions, I
would be glad to know of them.) Unfortunately, none of the above is
speculation--I know of instances of them all. To foresee one's eventual
decease is not speculative.

For independent scholars, for whom OA is a real boon as users, there is
no alternative at all for them as authors but personal pages maintained
upon personal or commercial servers--the extreme of instability.

While your arguments are correct for what should be the case, in the
real world the institutional basis for them is lacking, at least in the
US. (In the UK, the current proposal properly provides for the use of
the British Library as an archive for instances not covered by an
institutional archive; although I do not know the details, they may
prove sufficient.) Any reliance upon institutional archives in the US
would also need such a facility.

A similar argument could be constructed for the use of institutional
archives as a back up for centralized facilities; the NIH has proven of
superb responsibility over several decades, but it too is subject to

What do I mean by permanent? The lifetime of papers in a self-archive
must extend at least for the period of copyright, which is what permits
toll access--life plus 75 years. After this period the maintenance of
accessibility is the same problem for OA as for toll access.


Dr. David Goodman
Associate Professor,
Palmer School of Library and Information Science
Long Island University, Brookville, NY

-----Original Message-----
From: American Scientist Open Access Forum
On Behalf Of Stevan Harnad
Sent: Saturday, October 02, 2004 8:17 AM
Subject: Re: Central versus institutional self-archiving

On Fri, 1 Oct 2004, [identity deleted] wrote:

> While OAI compliance is a sine qua non condition of some measure of
> inter-operability, it does not (yet?) ensure the kind of ease of
> retrieval that other forms of archiving can provide, including some
> form of central archiving.

This is incorrect.

This erroneous view that central archiving is somehow better or safer
than distributed/institutional archiving is exactly analogous to older
views that on-paper publication is somehow better or safer than on-line
publication. The latter papyrocentric habit and illusion has happily
faded, thanks mainly to the force of the example and experience with the
growing mass of on-line content and usage. (But this obsolete thinking
did not fade before it managed to delay progress for several years; nor
has it faded entirely, yet!)

The instinctive preference for central over distributed archiving is a
remnant of that same papyrocentric thinking ("the texts are safer and
more tractable when they are all in the same physical place") and
will likewise fade with actual experience and more technical
understanding. The trouble is that the preference (in both cases) is
invariably voiced in contexts and populations that lack both the
technical expertise and the experience with the newer, untrusted

And it always appeals to an uninformed audience that is a-priori more
receptive to what more closely resembles the old and familiar than what
resembles the new and less familiar, and that bases its sense of what is
"optimal" not on objective experiment and evidence, but on subjective

The place to voice any doubts or uncertainties on technical questions
like this is among technical experts with experience, such as the OAI
technical group, not in the wider populace that is still naive and leery
about the online medium itself, archiving, and open access.

> Let us not forget that OAI-compliance may also lead to a mixing of
> various levels of documents, for example some peer-reviewed, others
> not.

The Eprints software includes the tags "peer-reviewed" and "not
peer-reviewed". This means documents can be "de-mixed" according to the
metadata tags, as intended. In addition, the journal-name tag is an
indicator. The old idea that physical location is the way to de-mix is
obsolete in the distributed online era that the Web itself so clearly

Moreover, the mixing of types of documents is a function of the
archiving policy, not of the archive-type (institutional or central) or

Lastly, the inclusion of peer-reviewed journal articles *and* both
preprints and post-publication revisions and updates is a desirable
complement, and can likewise be handled by various forms of pre- and
post-triage using both the metadata and meta-algorithms based on
metadata and full-text (de-duplication, dating and versioning at the
harvester level).
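
As a rough illustration of the "de-mixing" and harvester-level
de-duplication described above, here is a minimal Python sketch. It is
not the Eprints software or any real harvester: the Record fields, the
boolean peer_reviewed flag (standing in for the "peer-reviewed" /
"not peer-reviewed" tags), the archive names and the title-matching rule
are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    """A harvested metadata record (hypothetical, simplified fields)."""
    archive: str         # archive the record was harvested from
    title: str
    peer_reviewed: bool  # stands in for the peer-review metadata tag
    deposited: date


def demix(records, peer_reviewed=True):
    """Select records by their peer-review tag, wherever they are stored."""
    return [r for r in records if r.peer_reviewed == peer_reviewed]


def deduplicate(records):
    """Collapse copies of the same item, keeping the most recent deposit.

    Two records count as the same item here if their whitespace- and
    case-normalized titles and their peer-review status match. That rule
    is an assumption chosen only to keep the example short.
    """
    latest = {}
    for r in records:
        key = (" ".join(r.title.lower().split()), r.peer_reviewed)
        if key not in latest or r.deposited > latest[key].deposited:
            latest[key] = r
    return sorted(latest.values(), key=lambda r: r.deposited)


# Records as they might arrive from several archives (invented data).
harvested = [
    Record("inst-a", "On Open Access", False, date(2003, 5, 1)),
    Record("inst-a", "On Open Access", True, date(2004, 2, 1)),
    Record("central", "on  open access", True, date(2004, 3, 1)),
    Record("inst-b", "Another Paper", True, date(2004, 1, 15)),
]

refereed = deduplicate(demix(harvested))
for r in refereed:
    print(r.archive, r.title, r.deposited)
```

Nothing in the selection depends on which archive holds a record: the
same filter applies whether the copies sit in an institutional or a
central archive.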

> because of this, the perception of archives that are only
> OAI-compliant may not be entirely favorable. Scientists/scholars may
> not make much or even any use of these sources simply because they
> consider them as too "noisy" or worse.

Are we then to recommend policy not on the basis of the actual empirical
and technical facts, but on the basis of the prevailing "perception"? If
we had adopted that strategy, we would have renounced the online medium
itself a-priori, and renounced also the notion of Open Access! We are
here to promote what is in fact optimal, not what is *perceived* to be
optimal, according to existing habits and practices.

(Moreover, what is specifically at issue here is what form of
self-archiving to *mandate* -- institutional or central -- for erstwhile
non-self-archivers. This is an opportunity to guide and shape habits,
rather than to be held back by them.)

> Central (OAI-compliant) archiving is not mutually exclusive with
> distributed, OAI-compliant archives; it simply completes and
> reinforces the archival system that is being presently explored and
> experimented with.

That is entirely correct, and is one of the premises of OAI-compliance
interoperability: *All* forms of archiving are in fact forms of
distributed archiving and are made interoperable and equivalent by
OAI-compliance. So no one has said central archiving lacks any of the
functionality of institutional archiving.

The reason I (and others) are coming out so strongly in favour of
institutional archiving is hence not *functional*, since we fully
understand how and why both forms of archiving are functionally
equivalent (and indeed it is the advocates of central archiving that
often fail to grasp this, and argue on the basis of putative functional
advantages of central archiving that do not in fact exist).

The reason I (and others) are coming out so strongly in favour of
institutional archiving has to do with the probability of OA
content-provision itself, i.e., the probability that the OA archives
will be filled, rather than lie fallow, as well as the closely connected
probability that archive-filling will propagate across fields and
archives, rather than be restricted to just one field and archive. (It
also has a little to do with distributing the archiving burden and
costs, but that is not the primary reason.)

The authors of the annual 2.5 million articles that we would all like to
see self-archived as soon as possible are virtually all affiliated with
institutions of their own (universities or research institutions). They
also each have disciplines, but author/institution is the relevant pair
here, not author/discipline: not just because disciplines are nebulous
entities or because few disciplines have central archives and creating
and maintaining them is a much more nebulous matter, but because it is
authors and their respective institutions that share a common stake in
maximizing the access to and impact of their (joint) research output --
not authors and their respective disciplines (which are, if anything, a
locus of competition for impact, rather than being its joint
co-beneficiaries).

Moreover, institutions (particularly universities) also share most or
all of the disciplines. So when a self-archiving policy or practice is
adopted by an institution at all, the probability is very high that it
will also propagate across all of that same institution's disciplines.
Moreover, as institutions (and not disciplines) are in competition with
one another for visibility and impact, the probability is also high that
if some institutions adopt the policy and practice of self-archiving,
this will also propagate across (competing) institutions.

In addition, institutions, being the employers of their researchers and
the co-beneficiaries of research impact, are in a position to mandate,
monitor and reward compliance with an institutional self-archiving
policy (through employment, salary, promotion, tenure, etc.).

Disciplines have neither the interest nor the wherewithal to mandate,
monitor and reward central self-archiving. Neither do Learned Societies.

The one prominent and valuable non-institution-based exception is
research-funders, whether discipline-based or national/international and
pan-disciplinary: Research-funders too have an interest in maximizing
the access to and impact of the research they fund, and are hence in a
position to mandate, monitor and reward self-archiving.

However -- and this is a critical point, particularly with the US/NIH
self-archiving mandate -- research-funders can mandate, monitor and
reward self-archiving either way: They can mandate central
self-archiving, as the current version of the US/NIH recommendation
does, or they can mandate institutional self-archiving, as the UK
recommendation does. The effect, for the specific funded research
itself, is exactly the same. The critical difference is in the
probability of propagation *beyond* the specific funded research in
question, toward the 100% OA that we are all seeking.

Having established that institutional and central archiving are
functionally equivalent, and that research-funders can equally well
mandate, monitor and reward self-archiving on either a central or an
institutional basis, the only relevant question is: Which of these
otherwise completely equivalent means is more likely to yield more OA?
And the answer is unequivocal: mandating institutional self-archiving,
according to the UK recommendation, rather than central self-archiving,
according to the US recommendation.

Yes, there is some probability that discipline-based central-archiving
mandates by research-funders will propagate across disciplines and
research-funders too (and they no doubt will). But that propagation is
just as likely (and will in fact occur far more readily of its own
accord) if each discipline and research-funder does not need to create
and fund and maintain a central archive of its own, but can distribute
that load on the institutional OAI-archiving network, which is already
growing because of the self-archiving mandates of both prior
research-funders and of institutions themselves, and is already
propagating across the disciplines within each institution, and across

(Institutional self-archiving, by the way, is actually distributed
locally too, with departments administering and monitoring compliance in
their own sectors: that is part of the beauty and functionality of
OAI-interoperability -- as well as of the modular OAI-compliant
"Institutional" self-archiving should really be called
"Institutional-Departmental" self-archiving.)

A forthcoming analysis by Rowland & Swan commissioned by the UK Joint
Information Systems Committee (JISC) has come out decisively in favor of
distributed institution-based self-archiving over central
self-archiving, for a variety of reasons, including both functional and
economic ones based on efficiency, cost and ease of implementation and
monitoring, as well as strategic reasons based on institutional research
culture and probability and ease of compliance. (I will circulate the
URL of that report as soon as it is released.)

> Consequently, it does not make much sense to focus on this issue.
> Simply let archives flourish wherever they may and in whatever form.

On the contrary, it makes a great deal of sense to focus on this issue,
to try to understand it, and to try to guide policy and implementation
in the direction that is likely to maximise the propagation of OA
self-archiving across disciplines and institutions, rather than to
minimize it:

The US central self-archiving mandate will certainly generate OA for
NIH-funded biomedical research. But why not, for the same money and
mandate, generate so much more OA, by simply dropping the stipulation
that the self-archiving must be central (in PubMed Central), and instead
let the self-archiving propagate naturally across institutions and their
disciplines? This OA maximization is attainable at no functional cost or
sacrifice whatsoever. All it requires is a small parameter change that
will confer huge benefits.

> If some institutions seem to feel more at ease with the presence of
> some centralized archives, so be it, so long as they do not object to
> the parallel development of institutional, disciplinary or even
> individual archives.

I could not follow the logic of this. (It seems to confound two senses
of the word "institution"). As far as I know, no one has spoken about
what institutions do or do not feel "at ease" with (and most individual
sentiments of "ease" here are more about ease with what individuals are
accustomed to, rather than about what is actually optimal, either for
the individuals, their institutions, or OA). The issue concerns what
form of mandated OA self-archiving is likely to generate the most OA,

The concerned parties seem to be the following: (1) NIH, which is a
central (national) research-funding agency, which is also associated
with (2) NLM, which has a superb and invaluable central index for abstracts
and links across all of biomedicine, PubMed, and which also has
associated with it a small but useful and growing central OA Archive,
PubMed Central (PMC).

The US Congress is considering making it law that NIH should mandate the
self-archiving of all NIH-funded research in PMC. The self-archiving
mandate for NIH-funded research is extremely desirable and welcome. The
point under discussion here is that by changing one small parameter in
the mandate -- namely, to require only that the research be
self-archived in an OAI-compliant OA archive, without stipulating that
it must be PMC -- the very same NIH mandate can and will generate far,
far more OA, naturally propagating of its own accord across institutions
and their disciplines. The functionality will be identical (and PMC can
easily and automatically harvest all the NIH-funded institutional
metadata if it wishes, as well as serve as a backup OAI archive for the
full-texts if an author's institution does not yet have an OAI archive).
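
To make the harvesting step concrete, here is a minimal Python sketch of
how a central service could build an OAI-PMH ListRecords request for an
institutional archive and read identifiers and titles out of the
response. The verb, the oai_dc metadata prefix and the XML namespaces
come from the OAI-PMH and Dublin Core specifications; the base URL, the
sample response and the function names are invented for the example, and
it says nothing about how PMC itself would do this.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and by Dublin Core.
OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for one archive's OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for rec in root.iter(OAI + "record"):
        identifier = rec.findtext(f"{OAI}header/{OAI}identifier")
        title = rec.findtext(f".//{DC}title")
        pairs.append((identifier, title))
    return pairs


# A hand-written, cut-down response from a hypothetical institutional archive.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:archive.example.edu:42</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A funded article</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("http://archive.example.edu/oai"))
print(parse_records(SAMPLE))
```

A real harvester would also fetch the URL, follow resumption tokens and
handle errors; those parts are left out to keep the sketch short.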

The UK mandate (if/when implemented) is already optimal in this regard.

Stevan Harnad


UNIVERSITIES: If you have adopted or plan to adopt an institutional
policy of providing Open Access to your own research article output,
please describe your policy at:

    BOAI-2 ("gold"): Publish your article in a suitable open-access
            journal whenever one exists.
    BOAI-1 ("green"): Otherwise, publish your article in a suitable
            toll-access journal and also self-archive it.
Received on Sun Oct 03 2004 - 17:21:19 BST
