Re: Online Self-Archiving: Distinguishing the Optimal from the Optional

From: Arthur Smith <apsmith_at_APS.ORG>
Date: Tue, 11 May 1999 11:30:11 -0400

Apologies to Ginsparg and Harnad if I've taken their names in vain
in my classification system. But I think there really is a
sharp distinction between the systems II and III
which Harnad dismisses: under system II (Harnadian)
the literature is clearly always free to readers, because the
journal functions are funded through author page charges, and
the pertinent information is freely available on the web somehow
or other (e.g. through tags in a raw database). For system III
(Ginspargian) some, perhaps most, but probably not all, papers
are freely available through a raw archive that carries the journal
quality tags inconsistently, if at all, and journals still
charge readers for access to the full collections of papers they
accept and publish.

The distinction is really one of responsibility. Under system III
the author is responsible for the freely distributed version.
Under system II, the journal or other authoritative source takes
responsibility. Who do you trust more to get accurate information
about acceptance and publication status?

On Tue, 11 May 1999 Stevan Harnad <harnad_at_cogsci.soton.ac.uk> wrote:
> [... optimal is all refereed papers free to all ...]
>
> [... otherwise who does exactly what is optional ...]
>
> Still other steps are provisional, and conditional on the outcome of
> the above options (author self-archiving may be supplemented by official
> publisher overlays on the archive; institutional archiving may be
> subsumed by a global virtual archive; indispensable commercial
> "add-ons" might be created that everyone wants to pay for). Many
> contingencies are possible. No one knows exactly what shape the online
> literature will take eventually.

Yes - I find myself in pretty much full agreement!

> For my own part, I am prepared to make some predictions, but they are
> only predictions:
>
> [... following slightly reordered ...]
>
> (1) Readers will self-archive all their papers online (with the
        ^ - authors?
> occasional exception of those unpublished papers that some may
> sometimes wish to withhold for various reasons till they are accepted
> for publication, if they ever are; then they will be self-archived
> too).

They may or may not. Some authors will be diligent on this, others
will be lazy - there are barriers to "self-archiving", though not large.
I find it unlikely that you will ever see 100% self-archiving unless
it is somehow mandated. There will always be "heathens" left to convert...

> The overwhelming empirical evidence in support of prediction (1) from
> the Physics Archive is that authors will indeed understand and
> self-archive, once the archive is made available to them.
> See: <http://xxx.lanl.gov/cgi-bin/show_monthly_submissions>

And yet even for particle physics it is still less than 100% after
nearly eight years of evangelizing. And the growth has been linear,
not exponential.

>
> (2) Readers will overwhelmingly prefer to use the free online
> versions.
> [...]
> Again, the empirical evidence from physics is that readers will become
> quickly and overwhelmingly addicted to using the free archive.
> See: <http://xxx.lanl.gov/cgi-bin/show_weekly_graph>

But usage of our online journals (sorry I don't have handy graphs) has
grown even faster since they became available over the last 2-4 years.
For example, the OJPS system (which includes APS and AIP and some other
physics journals) received requests from 64,000 unique IP addresses
in February this year, which is double what it was last year, and is
at least very close to the unique host count that xxx receives in
a month (they quote about 30,000 unique hosts each week) - and the xxx
count hasn't changed much in the past year. Now we'll hear all
sorts of arguments about mirror sites etc. - but I think you have
to agree the numbers are at least comparable, and possibly larger
now for OJPS, and OJPS is growing faster in usage.

I don't know what xxx's red graph "number of connections" means, but
if it's the standard number of "hits" then we're way over that: xxx reports
600,000 connections a week, while OJPS reports over 2 million per week now,
and that's not counting the extra 300,000 or so per week we get for the
rest of the Physical Review journals site and PROLA. Now only 4-5 percent
of those are actual requests for full articles - but that's still a huge
number of articles delivered every week to "readers" around the world.
I have no idea what percentage of requests to xxx are full article
downloads. We have essentially the same services xxx has (freely
available abstracts too) plus some extras - full reference linking,
for example (xxx relies on the SPIRES database for reference linking
which is created by hand and only available for high energy physics papers).
We do have some additional graphics that add to the "hit"
count, so I would guess only the full article download numbers are
directly comparable.
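
As a rough check on the arithmetic above, here is a minimal sketch that
turns the quoted weekly hit count into an estimated range of full-article
requests. It assumes the 4-5 percent share applies to the 2 million OJPS
hits only (not the extra 300,000); the function name is hypothetical, and
no download figure for xxx is assumed, since none is given here.

```python
# Weekly figures quoted in this message.
OJPS_HITS_PER_WEEK = 2_000_000
XXX_CONNECTIONS_PER_WEEK = 600_000


def full_article_range(hits, low_percent=4, high_percent=5):
    """Return (low, high) estimated full-article requests per week.

    Assumption: the 4-5 percent share of hits that are full-article
    requests applies to the hit count passed in.
    """
    return hits * low_percent // 100, hits * high_percent // 100


low, high = full_article_range(OJPS_HITS_PER_WEEK)
print(f"Estimated OJPS full-article requests per week: {low:,} to {high:,}")
print(f"xxx connections per week (all request types): {XXX_CONNECTIONS_PER_WEEK:,}")
```

Under that assumption the OJPS range is 80,000 to 100,000 full articles
per week; the xxx figure is printed only for context and is not a
like-for-like number.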

In any case, the empirical evidence from physics is not
overwhelmingly one way or the other; if anything, it
seems to be heading more in the direction favoring the journals than
the eprint archives, contrary to your prediction.
>
> (3) The S/L/P market will accordingly shrink radically.
>
> For prediction (3) (S/L/P cancellation) to fail, libraries would have
> to continue to want to pay for S/L/P despite (2).

If (2) doesn't happen then (3) won't either. (2) implies (3), I agree.

>
> (4) Publishers will prefer to scale down to providing quality control
> only, funded by author-institution-end page charges.

I'm really not sure this is in any way feasible - doing it to
current standards would cost more than authors/institutions would
be willing to pay. But the arguments above mean that it probably
isn't necessary (if (3) doesn't happen, then (4) won't).

> [...]
> System I, the traditional system, is one in which the journal
> literature is available only via S/L/P tolls. Systems II and III
> are identical.

Well, I tried to argue above they are not. Under system II all
articles are necessarily freely available with all bells and whistles.
Under system III some or even most articles are freely available
with whatever the author attaches to them, but journals still
provide an S/L/P version that is complete, authoritative, and
in many ways different.

> [... barriers in II and III ...]
> Incorrect. Once every physicist is putting every published paper in the
> Physics archive, there is no barrier with III.

But the "every physicist" part is not happening, and will not happen
without some kind of enforcement which I don't see as being a good idea.
But even if every physicist did put every published paper in, does
that still make it free of barriers? Free of monetary barriers perhaps,
but there are likely to be other barriers, particularly in time to
find the right paper, and trust that it really is what you wanted to
find. Since under III the free system is controlled by authors,
all sorts of pieces could be missing or incorrect. Even with version
control, any difference from the "authoritative, published" version
is likely to make readers uncomfortable. Suppose the author gets
the published page number wrong - a volume-page based look-up would not
find the paper the reader wants, even though it is there. Suppose
the author doesn't even insert the journal publishing information - how
would a reader find the article? Suppose the same author has several
versions of the same paper of different lengths with slightly different
titles, none of which corresponds exactly with the published version?
These things happen, and are among the things that make the raw
archive under system III (the current system in physics) less useful
than the S/L/P journal literature.
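
To make the look-up problem concrete, here is a minimal sketch of a
volume-page index built from author-supplied metadata. The records, field
names, and identifiers are all invented for illustration; this is not how
any actual archive stores its data.

```python
# Hypothetical author-supplied records; every value here is made up.
archive = [
    {"id": "eprint-1", "journal": "Journal X", "volume": 10, "page": 123},
    # The author mistyped the page (the published page is assumed to be 456).
    {"id": "eprint-2", "journal": "Journal X", "volume": 10, "page": 465},
    # The author never inserted the journal publishing information.
    {"id": "eprint-3", "journal": None, "volume": None, "page": None},
]


def build_index(records):
    """Map (journal, volume, page) to the archive id, skipping untagged papers."""
    index = {}
    for rec in records:
        if rec["journal"] is not None:
            index[(rec["journal"], rec["volume"], rec["page"])] = rec["id"]
    return index


index = build_index(archive)
print(index.get(("Journal X", 10, 123)))  # correctly tagged paper is found
print(index.get(("Journal X", 10, 456)))  # mistyped page: look-up misses
print(len(index))                          # untagged paper is not indexed
```

The second look-up returns nothing even though the paper is in the
archive, and the third paper cannot be reached by citation at all.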

> [... Einstein in 1905 ...]
>
> Please! Are you saying that Einstein today would not have had access to
> the Web, to self-archive in LANL?

That was NOT the question - the question was would his paper have
been refereed and prominently and authoritatively accepted under
the Harnadian page-charge system? Sure Einstein could have posted
his un-refereed un-published paper to xxx. So what? There's lots
of stuff on there now that proves Einstein wrong :-)

And I think this IS an important point - the current publishing
system makes significant allowances for the outsider while ensuring
that ridiculous work does not get prominent play - I think at least
one side of that balance is likely to be lost under the Harnadian
system.

> [...]
> Again mixing the optimal with the optional/conditional. Never mind.
> Let's self-archive and let the market decide the rest. With a free
> archive, there can't be any real losers (among authors, readers, and
> their institutions).

Sure, we can live with that. We are now!

> [... self-tagging as an approximation ...]
> And if the approximation is never close enough to put an end to all
> demand for the enhanced S/L/P version, and hence scaling down to page
> charges never becomes necessary, that is perfectly fine with me! My
> mission is not to downsize journal publishers but to free the refereed
> literature! If one can happen without the other, so be it!

Hear, hear!

> [... on some institutions not paying up ...]
>
> Arthur, are you serious? Do you think you are dealing with the bootleg
> video market here? [...]

No, we're dealing with scientific researchers and institutions - who often
have to pinch their research pennies and weigh payments for one thing
against another. I'm not saying they would do this in a blatant
way, but it is much more likely to lead to fights than the current
system.

> (A delinquent author is not like a delinquent subscriber!
> This is reader-end thinking again...)

No - a delinquent author is much harder to deal with. What if the
work they are trying to publish is very important?

> [...]
> If journals can scale down S/L/P so that it can survive in co-existence
> with a free archive, without the need to switch cost-recovery to
> page-charges, I will still be at peace in my grave. The free literature
> is fee enough for me.

Well good. I think the next step is for other publishers to accept the
system as it has evolved in physics. I think this is already happening
in mathematics, for example, and may soon spread to the bio-medical
fields. It'll be interesting to watch this new era of competition evolve!

                        Arthur (apsmith_at_aps.org)
Received on Wed Feb 10 1999 - 19:17:43 GMT
