Re: The True Cost of the Essentials (Implementing Peer Review)

From: Mark Doyle <doyle_at_APS.ORG>
Date: Mon, 1 Apr 2002 13:40:46 -0500

Greetings David,

On Friday, March 29, 2002, at 05:25 PM, David Goodman wrote:

> Mark, In what respect are PDF and especially TeX archives flawed?

A uniform TeX archive built upon high level macros providing tagged
information might be a good archival format well into the future.
However, the key concepts here are "uniform" and "macros". arXiv.org
is very far from uniform and this makes the collection rather unwieldy.
I have no doubt that TeX the program is maintainable into the future.
However, migration to new formats that take advantage of new features
can be hampered by the limitation of what the author submitted. Thus,
TeX by authors using a good macro package can produce internally
hyperlinked PDF files, but authors who don't use the packages don't get
to migrate to the new features. An XML format wouldn't have this
problem.

Searching and linking are much more robustly done when the source
is marked up. TeX can be used this way. In fact, under my direction at
the APS we have developed REVTeX 4 which, if used correctly, provides
almost all of the tagging needed to extract a fully tagged XML file.
However, authors still need to apply it correctly and many do not.
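
To make the point about tagging concrete, here is a minimal sketch
(not the APS production tooling) of why macro-tagged source is easy to
convert and untagged source is not. It assumes the front matter uses
\title, \author and \affiliation with simple, brace-free arguments; the
function name, the XML element names and the sample text are all
hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical mapping from front-matter macros to XML element names.
MACRO_TO_TAG = {"title": "title", "author": "author", "affiliation": "affiliation"}

# Matches \macro{argument} where the argument contains no nested braces.
# Real TeX needs a proper parser; this is only an illustration.
TAGGED = re.compile(r"\\(title|author|affiliation)\{([^{}]*)\}")


def tex_front_matter_to_xml(tex_source):
    """Collect tagged front-matter fields into a small XML document."""
    root = ET.Element("article")
    for match in TAGGED.finditer(tex_source):
        macro, text = match.groups()
        ET.SubElement(root, MACRO_TO_TAG[macro]).text = text.strip()
    return ET.tostring(root, encoding="unicode")


sample = r"""
\title{A Tagged Paper}
\author{A. Author}
\affiliation{Some Institute}
\begin{center}{\bf An Untagged Paper}\\ B. Author\end{center}
"""

print(tex_front_matter_to_xml(sample))
```

The last line of the sample is formatted by hand rather than tagged, so
nothing in it is recovered. That is the situation with authors who do
not apply the macros.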

As for PDF, it really depends on how it was created. Word-generated PDF
files can have font problems on various platforms and newer versions of
Acrobat Reader sometimes create problems for existing files causing
characters to be dropped. We have had to redistill PDF files that were
created with particular versions of the Adobe distiller for our own
journals because there was a flaw that made them render incorrectly in
some PDF viewers. Other users occasionally report problems when they
print the documents (missing characters, blank pages, etc.). Good PDF
from TeX can be difficult to produce and the naive approach (dvips ->
distiller in the default configuration) produces awful results.

PDF is also not marked up for reliably extracting information
that could be used for linking articles. Adobe has added new features to
PDF to help here, but again, the authoring tool or someone has to add the
markup. PDF presumes a particular final formatting aspect. It can't be
reworked to be displayed to take advantage of a new technology (we are
talking 20 years out).

Bottom line is that PDF is pretty much a proprietary format that must be
tested in many viewers and even then, there are many subtleties to
producing good PDFs.

> The only thing I could find from your posting that they were deficient
> in is the provision of links. But this can be incorporated into the
> preparation of text, especially if all the documents are on OAI
> repositories.

Right, but my point is that only publishers (or librarians in the case of
SLAC/SPIRES for instance) incorporate them. Much of this can be automated
but there is a labor component for the hard parts. But I also have in
mind features like "find all papers that cite M. Doyle".
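
As a rough sketch of what such a feature needs underneath, assume the
reference lists have already been extracted and tagged so that each
cited author is a clean string. That tagging is the labor-intensive
part; the lookup itself is then just an inverted index. The data
layout, identifiers and function name below are hypothetical.

```python
from collections import defaultdict


def build_cited_author_index(papers):
    """Map each cited author to the set of papers that cite that author.

    `papers` maps a paper identifier to its reference list; each
    reference is a dict with an "authors" list of name strings.
    """
    index = defaultdict(set)
    for paper_id, references in papers.items():
        for reference in references:
            for author in reference["authors"]:
                index[author].add(paper_id)
    return index


# Hypothetical, already-tagged reference lists.
papers = {
    "paper-1": [{"authors": ["M. Doyle", "A. Author"]}],
    "paper-2": [{"authors": ["B. Author"]}],
    "paper-3": [{"authors": ["M. Doyle"]}],
}

index = build_cited_author_index(papers)
print(sorted(index["M. Doyle"]))  # ['paper-1', 'paper-3']
```

In practice the same author appears under several spellings, so the
exact string match used here would have to give way to some form of
name normalization.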

> The other part that might be missing is an organization that will
> permanently stand behind the repository. I do not think anyone regards
> commercial publishers as sufficient, and in response they are beginning
> to make arrangements with more durable organizations. Societies might
> be sufficient, if they are strong societies like yours. But surely you
> could just as well adopt the responsibility of maintaining ArXiv as you
> accept the responsibility of maintaining your current journals.

Yes, but we do have to pay for it. I would rather see a partnership
between APS and libraries to maintain the archive. Then we could
externalize the cost (rather than internalize more costs making it
harder to move away from the subscription model).

> I consider publishers' platforms universally a
> nuisance, and so do our users. Their use is increasing, because
> publishers do their best to direct users there as a form of
> self-advertising. If a user has a reference, the user wants to go
> to it, or at least the journal, not the publisher's home page.

Hmm, almost all major publishers are in CrossRef and this almost always
goes to a wrapper page with at least the abstract and a link to the full
text. I wouldn't call that "self-advertising" (that is really cynical).
There are good reasons to go to a wrapper page as well (even if the
articles themselves are free as in arXiv.org) - the articles may be
available in different formats or there may be useful features such as
linking or citing articles or pointers to errata and other related
papers that the user should be aware of. APS interfaces are decidedly
bare bones and users usually compliment them for being so. Anyway, from
the APS point of view we are taking users to the version we certify as
being peer reviewed. We wish it could be free to all, but we don't have
the economic model to accommodate this at the moment.

> The various features for personalization are of limited value when
> they are linked to a single publisher. They might be of great value if
> they offered universal coverage, and the APS could well provide this
> service for its members completely independently of publishing
> journals.

Perhaps, but you see this is precisely the kind of feature which is an
added expense and at this time I would have to side with Stevan and say
that this is a non-essential value-added feature. Deciding whether
features should be available only to members or to subscribers or to
whomever always generates a lot of internal debate around here. Sounds
like a great application for a library to provide as well if you ask me.
Shouldn't you provide it for your end users at Princeton? But your
feature request has been noted and who knows, we may do it some day.

Cheers,
Mark
Received on Mon Apr 01 2002 - 20:52:41 BST
