Re: The Green Road to Open Access: A Leveraged Transition

From: Eberhard R. Hilf
Date: Tue, 30 Mar 2004 00:13:28 +0100

Just some clarification on the posting of Subbiah Arunachalam, Leslie
Chan, and Barbara Kirsop.

Open Access to scientific documents as the solution for making articles
visible to other users and thus maximizing their reading and citation
(scientific impact) may involve various different forms of document space,
effort-levels by the authors or their institutions, and degrees of success
in achieving the aims of OA.

1. AUTHOR SELF-ARCHIVED, UNTAGGED PAPERS. The largest space of documents
is raw, untagged preprints and scientific reports self-archived by
their authors on their institution's webserver (e.g. on their research
group's website).

For Physics alone, we have already collected more than 500,000 such
documents. These are harvestable with almost zero effort by our web
crawlers. With PhysNet we already have almost complete coverage in some
areas of Physics.
(Clearly, however, the crawler of such websites can only be as good as
the native language assistance offered to us: check a Finnish author!)

Since the authors often provide their publication lists on their local
websites, information on journal articles is also harvested. But because
such authors have made no effort to tag their documents with metadata,
retrieval is somewhat fuzzy -- yet almost complete for field-specific
search engines such as PhysNet (started in 1995; but not so with google,
scirus, etc.).
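
As a rough illustration of what harvesting untagged documents from a
group's publication list involves, here is a minimal sketch in Python. It
is not the PhysNet crawler; the function and class names, the list of file
suffixes, and the example URLs are all assumptions made for illustration.
It only extracts links that look like full-text documents from one HTML
page, with no metadata to rely on.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed set of file suffixes that typically mark full-text documents.
DOCUMENT_SUFFIXES = (".pdf", ".ps", ".ps.gz", ".dvi")


class DocumentLinkParser(HTMLParser):
    """Collect absolute URLs of links that look like full-text documents."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Keep only links whose target ends in a known document suffix.
        if href and href.lower().endswith(DOCUMENT_SUFFIXES):
            self.links.append(urljoin(self.base_url, href))


def find_document_links(html_text, base_url):
    """Return document links found in html_text, resolved against base_url."""
    parser = DocumentLinkParser(base_url)
    parser.feed(html_text)
    return parser.links


if __name__ == "__main__":
    page = (
        '<ul><li><a href="papers/spin.pdf">Spin chains</a></li>'
        '<li><a href="/about.html">About us</a></li></ul>'
    )
    for link in find_document_links(page, "http://group.example.org/pubs/"):
        print(link)
```

Because the selection rests on file suffixes alone, such a harvest is
necessarily fuzzy, which is the limitation noted above for untagged
documents.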

2. AUTHOR SELF-ARCHIVED, TAGGED PAPERS. Visibility can be gained if the
author adds some metadata, e.g., by filling in a web template and adding
the result to the document source code, or by using a shadow file that
points to the document.

(See [year 2002 and earlier]. In case an author finds this too tedious,
we [the ISN] will do it for your lists!)

At the moment, in physics the portion of the document space provided in
this form is under 10%. Visibility is much increased, since most search
engines understand these metadata and therefore rank tagged documents
favourably. Costs are small.
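
The metadata tagging described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: it uses the
common convention of embedding Dublin Core elements as HTML meta tags, and
the function name and the sample record are hypothetical, not the output of
any particular web template.

```python
from html import escape


def dublin_core_meta(record):
    """Build HTML <meta> lines from a dict of Dublin Core elements.

    Each value may be a single string or a list of strings
    (e.g. several creators).
    """
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for element, values in record.items():
        if isinstance(values, str):
            values = [values]
        for value in values:
            # Escape quotes and ampersands so the attribute stays valid HTML.
            content = escape(value, quote=True)
            lines.append('<meta name="DC.%s" content="%s">' % (element, content))
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical record; the author would paste the printed lines into
    # the <head> of the document or of a shadow file pointing to it.
    paper = {
        "title": "A Hypothetical Preprint on Spin Chains",
        "creator": ["A. Author", "B. Author"],
        "date": "2004-03-30",
        "type": "Preprint",
    }
    print(dublin_core_meta(paper))
```

The effort for the author is limited to filling in a handful of fields
once per document.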

Let us call 1 and 2 "individual author self-archiving."

3. DISTRIBUTED INSTITUTIONAL OAI-COMPLIANT SELF-ARCHIVING. An Institute,
Department, or University Library may set up an OAI-compliant data
provider. The local document space is small; visibility is excellent
because of the systematically added metadata, the larger public visibility
of the institution as such, and easy retrievability by OAI service
providers.

This is distributed institutional self-archiving, and it is currently
increasing exponentially.

Costs may be larger (because library employees' salaries are involved).
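
To make the division of labour between an OAI data provider and a service
provider concrete, here is a minimal Python sketch of the harvesting side,
assuming the OAI-PMH 2.0 protocol with unqualified Dublin Core (the
`oai_dc` metadata prefix). The base URL, the function names, and the sample
response are hypothetical; the sketch parses an inline sample instead of
contacting a real archive.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for an OAI-PMH data provider."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Extract identifier, title and creators from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(OAI_NS + "record"):
        identifier = record.findtext(OAI_NS + "header/" + OAI_NS + "identifier")
        title = next((e.text for e in record.iter(DC_NS + "title")), None)
        creators = [e.text for e in record.iter(DC_NS + "creator")]
        records.append(
            {"identifier": identifier, "title": title, "creators": creators}
        )
    return records


# Hypothetical response, shaped like an OAI-PMH ListRecords reply.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:archive.example.org:1234</identifier>
        <datestamp>2004-03-30</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A Hypothetical Preprint</dc:title>
          <dc:creator>A. Author</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://archive.example.org/oai"))
    print(parse_records(SAMPLE))
```

Since every data provider answers the same request with the same record
structure, a service provider needs no per-site heuristics, which is why
retrieval from such archives is easy.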

4. CENTRAL OAI-COMPLIANT SELF-ARCHIVING. Authors send their documents
to central archives, such as ArXiv or the new European HAL operated by
the CCSD, supported by the French central research agency CNRS, and
under the auspices of the EPS with regard to physics.

Costs depend on services (e.g., time stamp, permanent archiving,
retrieval) and these are paid by the server's institution. Coverage is
even smaller (ArXiv has some 269,439 documents to date) -- about 15%
of physics journal publications to date, but this may increase
exponentially as more and more satisfied authors convince others.

5. PEER REVIEWED JOURNAL ARCHIVES. The costs depend on the services
offered. The prices charged are known. ACP (Atmospheric Chemistry and
Physics) has largely automated the processing and covers its costs with
a charge of $20 per page paid by the author. It offers the journal
open-access, with features such as several levels of refereeing,
annotations, communications added by readers, etc.

The Journals doing this form of archiving may be:

(A) OPEN-ACCESS JOURNALS ('gold' journals in the Harnad and Romeo
color-code), sometimes recovering costs by charging authors. Physics has
66 gold journals by now.

(B) TOLL-ACCESS JOURNALS ('green' journals in the Harnad and Romeo
color-code), charging the reader but endorsing individual-article
self-archiving by the author in one form or other (see ROMEO).

Now to close with some speculations about possible future scenarios for
journal publishing:

Since the refereeing normally takes place (and so it should) *after* the
preprint is self-archived on the web in one of the above ways, refereeing
can be much improved and diversified and can take its time (as it does
with ACP).
Subscribing to a journal is thereby decoupled from gaining access
to the information content of the raw document, and money is spent only
on the refereeing and polishing, as well as the archiving of the document.

Such a process of publicly self-archiving a document first and getting
it refereed afterward would save money with which institutional libraries
could subscribe to journals and would allow those publishers to flourish
who add real value. The expert would gain the desired document directly
from the author's website or elsewhere using search engines. Small
departments in remote countries would be able to get the unrefereed
information without having to pay, but they would miss the real
added-value services.

Eberhard R. Hilf, Dr. Prof.;
CEO (Geschaeftsfuehrer)
Institute for Science Networking Oldenburg GmbH
an der Carl von Ossietzky Universitaet
Ammerlaender Heerstr.121; D-26129 Oldenburg
tel : +49-441-798-2884
fax : +49-441-798-5851
Received on Tue Mar 30 2004 - 00:13:28 BST