Re: The significance of the LANL preprint server

From: Paul Ginsparg <ginsparg_at_QFWFQ.LANL.GOV>
Date: Thu, 22 Jul 1999 17:01:43 -0600

catching up on some old mail..., always a risk to send e-mail to these lists,
but i'm already so many tens of thousands of e-mail messages behind that i won't
see any follow-ups for months at earliest...

> Date: Sun, 11 Jul 1999 20:57:22 -0500
> From: "Ransdell, Joseph M." <ransdell_at_DOOR.NET>
> Subject: The significance of the LANL preprint server
> To: SEPTEMBER-FORUM_at_AMSCI-FORUM.AMSCI.ORG
>
> The magazine about academia called "Lingua Franca" has a page in the
> most recent on-line issue listing what it calls "The Tech 20", meaning
> people who seem to be especially significant for some reason or other in
> connection with "the digital revolution" (in the opinion of Jens David
> Ohlin, who apparently composed that page). Paul Ginsparg is among them
> and the blurb on him runs as follows:
> ...
>
> http://www.linguafranca.com/9907/tech20.html

note my only involvement in this was a few min phone call out of the blue where
i emphasized very different points, but we're always subject to the editorial
whims of the people who write these little blurbs, and that one while
slightly skewed was far from the worst...

[digression: the june nytimes article about biomed a much worse example, there
i spent much time explaining at length to reporter (Robert Pear) why these
developments might actually be positive and *a good thing* for research and
researchers, and perhaps even the american taxpayer; but all was excised in
favor of whining from various status quo parties concerned they might have to
improve the quality of their operations to maintain their slice of the pie...
plenty of subtle editorializing in their choice of subject matter, and in
general science journalists also have done a remarkably poor job, somehow
identifying with and rarely questioning any of the assumptions of the
publishers. other examples of editorializing are the usual attempt to
marginalize the archives as restricted to "high energy physics" (and usually
claiming all of whose practitioners know one another personally...),
ignoring that the high energy physics component, though quite a large community,
is nonetheless now less than a quarter of the incoming flux.
or the usual confused comments from bloom in science (at least marked as
editorial) http://www.sciencemag.org/cgi/content/summary/285/5425/197 ,
funny how they link everything except for providing a link to the
"los alamos server" (and don't manage to get the name correct as "archives",
though the not yet existent but ever-frightening ebiomed is termed "archive").
or all the unchallenged assertions, e.g. the article in Nature, July 8, 1999,
"EMBO backs single electronic repository", with "assertion" by Frank Gannon,
executive director of EMBO (European Molecular Biology Organization):
> An unrefereed site along the lines of the Los Alamos e-print server in
> physics will not work in biology, asserts Gannon.
maybe, maybe not -- i don't know, but seems to me an experimental question,
don't biologists like experiments, what's the harm in finding out?
i.e., who stands to lose?]

> model inspiring the present movement increasingly appears to have little
> or nothing to do with its function as a preprint server even though it
> was nothing but that until recently! But now it seems that it is really
> the overlay functions which put it in the service of the refereed
> journals, plus the addition of various librarial hypertext features that
> counts. Did I have it wrong to start with in thinking the preprint
> server important? I don't think so, and what follows is intended to
> explain why.

while its initial conception was that of a preprint "bulletin board" back
in mid '91, within a month it became an "archive" when the small group of
physicist users involved at the outset opined that it would be more useful
if things remained up indefinitely (instead of the originally envisioned
three months -- that was one of the reasons the papernumbers are keyed by
year/month, so that in nov '91 there would have been an automated `rm 9108*`
had that methodology not already been rethought by sep '91...)
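
[a minimal sketch of why year/month keying would have made that expiry trivial,
assuming identifiers that begin with a YYMM prefix and a three-month retention
window; the function name expired_prefix is hypothetical, and this is an
illustration only, not the original csh scripts:]

```python
from datetime import date


def expired_prefix(today: date, retention_months: int = 3) -> str:
    """Return the YYMM prefix of the papers that would expire in today's month.

    Assumption: paper numbers start with a two-digit year and two-digit month,
    so one glob such as `9108*` matches a whole month of submissions.
    """
    # Count months from year 0 so the subtraction crosses year boundaries cleanly.
    months = today.year * 12 + (today.month - 1) - retention_months
    year, month_index = divmod(months, 12)
    return f"{year % 100:02d}{month_index + 1:02d}"


# In nov '91 the expiring batch would have been the aug '91 papers.
print(expired_prefix(date(1991, 11, 1)))  # 9108
```

[the returned prefix is what a monthly cleanup job would have appended `*` to
before deleting; keeping things up indefinitely made the whole step unnecessary.]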

> I remarked earlier that the Caltech Proposal seemed not to have the Los
> Alamos system in proper perspective, as was shown in there being nothing
> in the system proposed that corresponds operationally to the preprint
> server -- which, however, did not even stimulate an attempt to explain
> the seeming discrepancy away.

http://library.caltech.edu/publications/ScholarsForum/ did say:
  IV. DOCUMENT DATABASE
  The centerpiece of this proposal is a document database that incorporates
  and builds on important features derived from Paul Ginsparg's highly
  successful physics preprint server. Begun in 1991 and today comprising
  nearly 100,000 records in physics and related disciplines, xxx.lanl.gov
  demonstrates the viability of a large electronic archive that supports
  alerting services, automated hyperlink referencing, indexing, searching,
  and archiving.

but it has never been entirely clear why ultimately there is such fear of permitting
raw systematic author-to-author communication, which was always the intent
of the physics archives (note author-to-author, rather than other
non-authoring readers, was primary for research infrastructure).
the formal descriptions of these things are always difficult to follow
(ditto e-biomed) -- i'd prefer to see more attempts at on-line implementation
which would focus the logical structure more clearly. (and not sure where
exactly caltech scholar's forum or other similar initiatives now headed,
perhaps still looking for programmers...?)
i just wanted a clean system for authors to be able to communicate without
hindrances and intermediaries. i still think it can be largely automated
(would say completely if not for the current abysmal state of authoring tools,
but that too should improve eventually).

> Caltech proposal, Phelps describes the LANL system solely in terms of
> its librarial functions, saying nothing of its publication function, and
> characterizes it, moreover, as a "forum", which is an inaccurate
> description of it. (Also, I note that in the BMJ/Stanford plan for an
> archive the LANL system also seems to be implicitly regarded as a forum,
> since that term is used in their description of their archive as modeled
> on the one at Los Alamos.)

in many cases just ignorance, in others intentional spin (cf. above on
"editorializing")

> I omit much I would like to say in this connection and will only make
> the concluding point that if clarity is not gotten on this, the present
> movement is likely to be sputtering out prematurely because of the
> contradiction involved in basing it on the LANL preprint server as the
> starting point and then abandoning that as an essential element in the
> movement because it proves to be too problematic to incorporate in the
> system being developed.

indeed this ultimately is the difference between those trying to create a
new and better form of research infrastructure, and those for lack of real
insight trying to create an electronic "clone" of the paper publishing system.

> fields or few: I have no opinion on that. The chief point of the
> present message is that it is important, in any case, to avoid
> explaining it away as something else -- such as a discussion forum or a
> bulletin board, or, as in the Lingua Franca piece, as a sort of
> preparatory zone relative to the goal of getting one's papers published
> in impeccable form. That is not what the LANL server is all about, and
> its principles can never be extended effectively to other disciplines
> with that sort of misunderstanding of it.

it is also important to note that none of the archive's users suffer from any
of the ambivalence and second guessing; they're all using it completely
oblivious to debates ongoing here and elsewhere, totally oblivious..., and
blissfully ignorant that anyone would find any of these issues controversial
(or even particularly interesting, having moved on to other things...)

> Date: Tue, 13 Jul 1999 20:55:19 +0100
> From: Stevan Harnad <harnad_at_coglit.ecs.soton.ac.uk>
> Subject: Re: Publication at LANL as involving peer review
> To: SEPTEMBER-FORUM_at_AMSCI-FORUM.AMSCI.ORG
>
> LANL (The Los Alamos Eprint Archive) does NOT just consist of unrefereed
> preprints. It started that way, but by now, in Year 8, authors are
> annually archiving both unrefereed preprints AND refereed final drafts,
> and, quite naturally, swapping the latter for the former once it is
> available.

actually they were doing that from the outset, first replacements began within
days of starting up 8 years ago, and it was always in authors' interest
to keep things up to date.

> (There are no figures yet on proportions, but I hope the stats engine
> will soon distinguish unrefereed preprints from refereed reprints: Paul?).

eventually, there's a movement here within the library to cross-correlate with
the publications database to fill in the gaps. no estimated timeframe yet though...

> (3) LANL has not REPLACED classical peer review, it is completely
> parasitic on it,

of course since there's no classical peer review here it hasn't REPLACED
anything, but it is not clear that it is "completely parasitic" on it,
since it is a fallacy in the extreme to imagine that physicists would decay
to the ground state of usenet newsgroups in the absence of "classical peer
review". other fields perhaps would, i have no idea, but in this one i'm
willing to assert with complete certainty (and quite intentionally without a
shred of proof, to discourage follow-up debate...).

> Date: Wed, 14 Jul 1999 15:01:59 +0100
> From: Stevan Harnad <harnad_at_coglit.ecs.soton.ac.uk>
> Subject: Re: Publication at LANL as involving peer review
> To: SEPTEMBER-FORUM_at_AMSCI-FORUM.AMSCI.ORG
>
> It is a historical fact that LANL began as a preprint distribution
> network among 100 high energy physicists.

actually it's a historical fact that the hep-th archive based at xxx.lanl.gov
began as a preprint distribution network among 200 theorists working on
so-called matrix models of 2d quantum gravity and fluctuating surfaces with
applications to string theory. in retrospect this was an avenue of research
already playing itself out (not to be confused with the more recent incarnation
of (M)atrix models underlying M-theory underlying string theory, initiated
in '96; another avenue more or less played out, but that's the way cutting
edge research goes, win some lose some...)

> It is a further historic fact that it rapidly grew to encompass more and
> more authors and users, covered more and more of physics and beyond, and
> came to include the refereed final drafts too.

yes, except that swapping initial draft for the refereed final draft was built
in from the moment it officially became an "archive" eight years ago,
not a much later afterthought.

> So self-archiving is the "model" and the take-home message of LANL, and
> not merely, or primarily, the self-archiving of unrefereed preprints.

yes, originally "e-print" was a pun on preprint, and first appeared on a page
created by Dave Morrison at Duke in Feb 1992 for "Algebraic Geometry E-Prints",
the second archive based on my original hep-th csh scripts. (on that page,
dave credited his colleague Greg Lawler with coining the word.)
the word "e-print" then quickly devolved to meaninglessness but more recently
has been rehabilitated to mean an article either in draft or final form
SELF-ARCHIVED by the author. (at least that's what i use it to mean, but i'm
not a lawler...)

> Well, isn't it odd that today, when LANL is up to 20,000 new papers annually,

actually 20,000 is where it was in calendar '97, in calendar '99 it'll likely
be over 30,000 new papers. (and that 30,000 will be roughly a quarter of
the projected year-end overall total of 120,000 since '91, ever accelerating)

> that that alternative still does not seem to have caught on,
> and Physics journal submissions continue to proceed apace?

one also has to be very careful with claims e.g. that the archives here (or
"the high energy physics server") have no measurable effect on the
literature of and publishing trends in physics.
there is of course a context in which this statement is rigorously true,
albeit rigorously irrelevant. from some viewpoint, physics authors continue
publishing papers in journals and libraries continue to subscribe to those
journals. that there may have been a revolution in the way that researchers
actually retrieve information and communicate with one another doesn't enter
this viewpoint, and even if no physicists ever again consulted the printed
(or on-line) version of a journal in a library, there would be those who
could still argue lack of effect as long as libraries continued their
subscriptions (though the current unstable equilibrium is unlikely to persist).
this disconnect is built into the current system, the closed loop between
librarians and publishers, that sometimes regards researcher behavior or needs
as ignorable anomalies or just plain nuisances. it's not intentional, nor by
any means universal, but is a subtle bias in the way that the scholarly
publishing industry has itself developed in the post WWII period.

> Date: Tue, 13 Jul 1999 22:45:53 -0500
> From: "Ransdell, Joseph M." <ransdell_at_DOOR.NET>
> Subject: Re: Publication at LANL as involving peer review
> To: SEPTEMBER-FORUM_at_AMSCI-FORUM.AMSCI.ORG
>
> > This makes the question of whether what is going on in LANL is some new
> > form of peer review incoherent: Classical peer review is exerting its
> > FULL, usual quality control functions on the final drafts in LANL.
>
> This skips over the fact that the scientific work is being done by the
> preprints, not the final drafts.

indeed

> Date: Thu, 15 Jul 1999 12:28:39 +0100 (BST)
> From: Stevan Harnad <harnad_at_coglit.ecs.soton.ac.uk>
> To: September American Scientist Forum <SEPTEMBER-FORUM_at_AMSCI-FORUM.AMSCI.ORG>
> cc: "Harold Varmus (NIH Director)" <execsec1_at_od.nih.gov>
> Subject: Re: Publication at LANL as involving peer review
>
> Here is my empirical prediction: Eliminate the classical peer review and
> LANL will devolve into the anarchic, uncharted and un-navigable anarchy
> of Usenet's NetNews (as would any domain of human endeavour if it
> ceased to be held accountable to quality standards.)

whatever empirical predictions are, this one is unlikely to be correct.
the quality standards will be there regardless, and in many important fields
of scientific endeavor have not historically been enforced by "classical
peer review". it's silly to ignore that the community of scientists is
vastly different from the community of usenet newsgroup users, has a very
different formal training -- you might as well accuse all researchers of being
closet serial killers waiting to come out.

> (2) Peer review is not and never has been just a go/no-go "filter": It
> is an interactive, dynamic, corrective feedback process, sometimes
> proceeding through several iterative revisions and re-refereeings,

becomes a discussion of religion --
in some fields it clearly is a go/no-go "filter", or at least perceived so.
perhaps one should conduct a study to determine whether researchers who
have experienced the "interactive, dynamic, corrective feedback process"
are just a tiny minority (i personally rarely hear from them), and whether
the majority of authors (and referees) feel that the improvements mediated
by this process justify all the time and energy spent (i hear from many
people who are skeptical on this count, though i'm not claiming to hear from a
representative sample, just saying that there are perfectly vital research
fields whose practitioners have never experienced the wonders of "classical
peer review" as described above, and the quality of research in these
fields has not suffered in consequence. these are people who would find
laughable the notion that any invisible hand -- other than the perfectly
obvious need to communicate clear and accurate results to peers -- is
keeping them in line.)

but as stevan has ever-repeatedly emphasized, none of this leads to any
disagreement on how to proceed -- author/institution self-archiving is the key
and entirely independent of what will be the ultimate quality control or
improvement mechanism and how exactly it will be organized, and it can proceed
decoupled from any consideration of tampering with "classical peer review"
(regardless of where one stands in the continuum between considering
classical peer review so wonderful and perfect it needs no alteration,
or so cumbersome and imperfect it is preferably abandoned).

pg
Received on Wed Feb 10 1999 - 19:17:43 GMT
