Self-Archiving's Why's

From: Stevan Harnad <harnad_at_COGPRINTS.SOTON.AC.UK>
Date: Thu, 19 Oct 2000 15:13:56 +0100

[These are excerpts from an interview to appear shortly; URL
to come when known.]

> Why do you feel so strongly about open archiving online?

Unlike most books and magazine articles, scholarly and scientific
research papers are written to make an impact on research and
researchers, not to earn royalty income or fees from sales of the
text. Hence fee-based access barriers (subscription, site-license,
pay-per-view [S/L/P]) are impact barriers. Researchers would prefer,
and would always have preferred, full free access to their research
reports for everyone. In the paper era, with its expenses, this was not
possible; the true costs of that means of dissemination had to be paid,
and they were high. In the on-line era it is possible to free this
special literature at last, through self-archiving by authors in
interoperable Eprint Archives.

> Do you feel that research is or will be conducted differently with
> the use of the Internet and these archives?

Research can only benefit from the much wider, unobstructed reach a
freed online refereed corpus will provide. Researchers will be far more
up to date and informed and research will have a much broader impact.

In addition, the online medium is much more interactive, allowing
commentaries and responses and updates to be linked to the archived
literature, both pre- and post-refereeing. Citation linking and
analysis, linked data-sets, and enhanced
resources for online collaboration are among the other benefits of an
online digital research corpus.

> When was CogPrints: The Cognitive Sciences Eprint Archive, set up?

Two years ago. First it was a centralized, multi-disciplinary eprint
archive. Then, with the Open Archives Initiative (OAI), which provided
meta-data tagging standards to ensure cross-archive interoperability,
CogPrints was upgraded this year into one of the first registered
OAI-compliant Archives. The archive-creation software was also
generalized so that OAI-compliant Eprints Archives can now be mounted,
registered and filled by any institution.

> How successful is it?

The Los Alamos Physics Archive, up since 1991, has
130,000 papers; CogPrints, in its 3rd year, has only 1,000. Something
was needed to accelerate us toward the optimal and inevitable (the
entire refereed literature online and free), and the hope is that the
software will be adopted by more and more institutions to
create distributed, OAI-compliant Eprints Archives. Being
interoperable, these can all be harvested into one global "virtual"
archive, with papers searchable and retrievable by everyone, with no
need for users to know in advance which of the Eprint Archives a
particular paper actually happens to be archived in.
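
As a rough illustration of the harvesting idea just described, here is
a minimal sketch in Python. It is not the OAI protocol itself: the
archive names, the records and the harvest/search functions are all
hypothetical, invented only to show how metadata from separate
archives might be merged into one searchable index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    archive: str  # which Eprint Archive actually holds the paper
    title: str
    author: str


def harvest(archives: dict[str, list[dict[str, str]]]) -> list[Record]:
    """Merge metadata records from every archive into one flat index."""
    return [
        Record(archive=name, title=meta["title"], author=meta["author"])
        for name, records in archives.items()
        for meta in records
    ]


def search(index: list[Record], term: str) -> list[Record]:
    """Case-insensitive title search across the merged index."""
    needle = term.lower()
    return [rec for rec in index if needle in rec.title.lower()]


# Hypothetical archives and records, invented purely for illustration.
archives = {
    "archive-a": [{"title": "Symbol grounding revisited", "author": "A. Author"}],
    "archive-b": [
        {"title": "Grounding in neural nets", "author": "B. Author"},
        {"title": "Quark confinement", "author": "C. Author"},
    ],
}

index = harvest(archives)
hits = search(index, "grounding")
for rec in hits:
    # The user never had to know in advance which archive held the paper.
    print(f"{rec.title} (held in {rec.archive})")
```

The user queries the merged index only; which archive holds a given
paper is just a field carried along in the result.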

> Your vision for the future is to have unlimited online access to all
> research articles in all disciplines for everyone. How far do you think
> this ideal has been achieved?

I think that posterity will laugh at us for taking as long as we
have been taking, because we could already have done it a half decade
ago. But I think we are at last getting around to it now...

> What do you see as the major barriers to achieving these goals?

Chief among them is the sluggishness of human nature, which tends to
cling to old ways even when they are no longer optimal and are easily
updated.

That's the main retardant. Others include the (understandable) wish of
journal publishers to protect their current revenue streams and modi
operandi by preserving the status quo as long as possible.

There is no point waiting for publishers to scale down to what is the
optimal and inevitable solution for research: Researchers can take
matters into their own hands by self-archiving. And this can be done
legally, now, even if authors are obliged to sign the most restrictive
copyright transfer agreements.

There is also still confusion (some of it inherent in the questions
being asked here) about what needs to be freed, and how: confusion
between the non-give-away literature (books) and the give-away
literature (refereed research papers), between electronic archiving and
electronic publication (vanity press), between preprints and
postprints, between copyright protection from theft-of-text (relevant
only to non-give-away authors) and copyright protection from theft of
authorship (relevant to all authors).

But by dint of tireless repetition, these confusions seem to be
dissipating now.

> To what, in your opinion, can the success of the Los Alamos Physics
> archive for unrefereed preprint literature be attributed?

Physicists set off on the road to the optimal and inevitable first.
They still haven't gotten all the way (the Los Alamos Archive is still
growing only linearly, which would still mean a decade or more before
it captured the entire refereed literature of Physics), but I hope that
a proliferation of new interoperable, institution-based Eprint Archives
will help propel the growth rate into the exponential range.

It will remain an undeniable historical fact, however, that Physicists
did it first -- not, I think, because self-archiving is more suited to
Physics in some way, or because Physics benefits more from the freeing
of its refereed literature than any other discipline: I think
Physicists did it first simply because they are smarter than the rest
of us, and more serious about their research, and hence they have much
less patience with the status quo. We can even estimate how much
smarter/faster they are by dividing the ten-year contents of the
Physics Archive by the three-year contents of the CogPrints Archive:
(130,000/10) / (1000/3) = 39 times as smart/fast...
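
The arithmetic above can be checked with a few lines of Python. This is
only a sketch: it assumes steady average growth over each archive's
lifetime, and the function name is hypothetical.

```python
def papers_per_year(total_papers: int, years: float) -> float:
    """Average deposit rate, assuming (for illustration) steady growth."""
    return total_papers / years


physics_rate = papers_per_year(130_000, 10)  # Los Alamos Physics Archive
cogprints_rate = papers_per_year(1_000, 3)   # CogPrints
ratio = physics_rate / cogprints_rate

print(f"Physics:   {physics_rate:.0f} papers/year")
print(f"CogPrints: {cogprints_rate:.0f} papers/year")
print(f"Ratio:     {ratio:.0f}")
```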

> Why do you think that other disciplines are slow to follow the
> CogPrints and Los Alamos archives?

See above. But I think that with distributed, institution-based Eprint
Archives supplementing central ones, the momentum will now transfer
across fields -- especially with the help of pressure on researchers by
their institutions to self-archive their work to maximize its impact
(and to eventually lighten the institution's serials S/L/P burden).

> Could it be that scientists in other disciplines simply communicate
> in different ways?

Not in any relevantly different ways. All rely on their respective
refereed journal literature. No institution can afford S/L/P access to
it ALL, or even to most of it. So all researchers in all disciplines have
access to much less than they would use if they could. Moreover,
on-line access to it all is infinitely better for everyone than
on-paper access to just an affordable portion of it (on-line includes
on-paper, because you can always print off whenever on-screen surfing
is not enough).

So there are no discipline differences whatsoever here. The reason
people ask the question has specifically to do with PREprints (i.e.,
physicists' heavy use and reliance on pre-refereeing drafts of their
papers). This is irrelevant, because what we are talking about here is
much bigger than just the preprint question: We are talking about
Eprints, which includes both pre-refereeing preprints and refereed
postprints, with the emphasis on the latter, because the latter is the
refereed literature that self-archiving is intended to liberate!

So just as it makes no difference how much of its free on-line
literature a discipline prefers to read on-screen or on-paper (the
essential thing is that it all be on-line and free), so it makes no
difference how much a discipline prefers to read its literature in
preprint or postprint form: the essential thing is again that it all be
on-line and free.

In short: No pertinent discipline-differences at all here.

> Once preprint servers are set up in other disciplines, do you think
> they will be as successful as the Los Alamos Server?

Yes, and all of them will be more successful than even Los Alamos is
now, because they will fill exponentially until the entire refereed
corpus is in there. But (to repeat) we are not talking about "preprint"
archives, but about EPRINT archives (eprints = preprints +
postprints). Moreover, we are talking about both Los-Alamos-style,
centralized, discipline-based archives and distributed,
multidisciplinary, institution-based archives. The
essential thing is that they be OAI-compliant, hence fully
interoperable.

> Have you seen the Chemistry Preprint Server hosted by

Yes. All players are welcome (but they are most welcome when they
archive both preprints and postprints, and archive them all
permanently, interoperably, for free for all).

> Do you have any tips for anyone wanting to start up an archive?

Yes, go to, pick up the (free) self-archiving software,
install it at your institution, and have all researchers self-archive
all their preprints and postprints in it, now. If everyone did that
today, we would be instantly fast-forwarded to the optimal and
inevitable.

> How would you like to be remembered?

For the remarkable work I will be able to do once the refereed corpus
on which it draws is all on-line and freely accessible to me -- and
to all other researchers.

Stevan Harnad
Professor of Cognitive Science
Department of Electronics and phone: +44 23-80 592-582
             Computer Science fax: +44 23-80 592-865
University of Southampton
Highfield, Southampton

NOTE: A complete archive of the ongoing discussion of providing free
access to the refereed journal literature online is available at the
American Scientist September Forum (98 & 99 & 00):

You may join the list at the site above.

Discussion can be posted to:
Received on Mon Jan 24 2000 - 19:17:43 GMT
