Re: Publisher's requirements for links from published articles

From: Stevan Harnad <harnad_at_ecs.soton.ac.uk>
Date: Fri, 25 Apr 2008 15:27:52 +0100

On Fri, 25 Apr 2008, Jean-Claude Guédon wrote:

> Let me summarize the procedure I envision. This may help clear things
> up.
>
> 1. An IR checks if the deposited article is the same as the published
> article.
> 2. The IR declares that it is the same as the published article, for
> example in a general header that is visible from any part of the site.
> 3. Because the article in the IR has been declared equal to the journal
> article, it can be cited in its IR version and not in the journal
> version. If IRs agreed to use a common persistent identifier, this
> would be very useful.

What is cited is a *work*, not a physical instance of that work.

The work is a published refereed-journal article. The citation is author,
date, title, journal, volume, issue, page-span.

The URL (or DOI) then specifies a locus where you can access that work. A
fine addition to the citation information. But what is *cited* remains
the canonical published work.

The institution, in authenticating the text of its physical instance,
does not thereby become the publisher or the locus of publication of
the work. That continues to be the journal in which it was published. It
has simply provided an (open) access location.

> 4. Because the IR version is citable, its way of laying out the document
> is as good as any. If it uses pages, then the IR pagination is fine for
> the citation. The need to go back to the journal version to check the
> page number for citation purposes disappears.

What is cited is the work.

If, in citing the work, specific passages are quoted or noted in the text,
their locus in the work can always be indicated by section headings plus
paragraph numbers. Those are work-internal landmarks and do not depend
on formatting.

Those who still cleave to page numbers in the online age, and who do
not know on what pages of the journal's PDF a quoted passage appears,
can pinpoint the passage by using the pagination of the IR version's PDF
if they wish. But that sounds like a convoluted way to do it, and it makes
the new work dependent on the format and locus of a specific instance
of the cited work, instead of on the canonical work itself. It seems to me
the disadvantages vastly outweigh the advantages.

http://users.ecs.soton.ac.uk/harnad/Hypermail/Theschat/0037.html
http://users.ecs.soton.ac.uk/harnad/Hypermail/Amsci/0288.html
http://www.library.yale.edu/~llicense/ListArchives/0610/msg00165.html
http://www.library.yale.edu/~llicense/ListArchives/0610/msg00169.html
http://media.library.ku.edu.tr/refpgs/sociology/style_apa.htm

Here is APA style (adopted on the basis of my recommendations in the
mid-90's):

    "For electronic sources that do not provide page numbers, use the
    paragraph number, if available, preceded by the ¶ symbol or the
    abbreviation para. If neither paragraph nor page numbers are visible,
    cite the heading and the number of paragraph following it to direct
    reader to the location of the material."
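
To make the heading-plus-paragraph idea concrete, here is a minimal
sketch in Python. It is only an illustration: the representation of a
work as (heading, paragraphs) pairs and the names normalize and
locate_passage are my own assumptions, not part of any repository
software or of the APA manual. The point it shows is that the locator
is computed from the text of the work alone, so any faithful instance
(journal PDF, IR draft) yields the same answer regardless of layout.

```python
from typing import List, Optional, Tuple

# Hypothetical representation of a work: an ordered list of
# (section heading, [paragraphs]) pairs. Illustrative only.
Work = List[Tuple[str, List[str]]]


def normalize(text: str) -> str:
    """Collapse runs of whitespace so line wrapping and layout do not matter."""
    return " ".join(text.split())


def locate_passage(work: Work, quote: str) -> Optional[str]:
    """Return a locator such as 'Discussion section, para. 1', or None if not found."""
    needle = normalize(quote)
    for heading, paragraphs in work:
        # Paragraphs are numbered from 1 within each section.
        for number, paragraph in enumerate(paragraphs, start=1):
            if needle in normalize(paragraph):
                return f"{heading} section, para. {number}"
    return None


# Made-up sample text, wrapped differently from the quotation on purpose.
article: Work = [
    ("Introduction", ["Open access widens readership.", "Citations follow access."]),
    ("Discussion", ["Page numbers depend\non the layout of one instance."]),
]
print(locate_passage(article, "depend on the layout"))
```

Because whitespace is normalized before matching, a passage that is
wrapped or paginated differently in two instances still resolves to the
same section and paragraph.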

> 5. As a courtesy, a reference could be optionally added to the effect
> that the article is also available in a given journal.

The origin of this discussion was less about courtesy to the publisher
than about the request, by some Green publishers, that in exchange for their
endorsing immediate Green OA self-archiving in their authors' IRs, the
author include a link to the publisher's URL or DOI for the canonical
version. That not only seems a small price to pay for the publisher's
blessing on self-archiving, but it also seems to be good scholarly
practice in the online and OA era.

> 6. The IR ought to mention the journal version. This would allow
> aggregating the citations coming to the same article through various
> channels. It would also allow directly measuring the role of OA IR's in
> citation counts.

The citations are to the work, not to the locus or the physical
instance: ISI, Google Scholar, Scopus, Citebase and Citeseer do not care
about the physical instance but about what work is cited by other works,
how many times.

No aggregation is necessary, except if the published work was *preceded* by
an unpublished preprint, and that preprint was cited before the canonical
bibliographic citation was available. (Authors usually hurry to update
the bibliographic citations of their self-archived drafts, specifically
to ensure that the published work is cited and not the unpublished draft.)

This problem is soluble (by smart citation aggregation), and worth solving,
though one might want to treat references to the unrefereed preprint
separately for scholarly purposes.
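
A rough sketch of what such aggregation could look like, in Python. This
is an assumption-laden illustration, not how ISI, Google Scholar, Scopus,
Citebase or Citeseer actually work: the Citation fields, the work_key
normalization and the aggregate function are all hypothetical. The key
deliberately omits the year, since a preprint and the published article
often carry different dates. Citations are pooled per work, while
references to the unrefereed preprint are tallied separately.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class Citation(NamedTuple):
    # All field names are illustrative assumptions.
    first_author: str
    title: str
    refereed: bool  # False when the citing paper pointed at the unrefereed preprint


def work_key(citation: Citation) -> str:
    """Key identifying the work, ignoring case, punctuation and spacing."""
    raw = f"{citation.first_author} {citation.title}".lower()
    return re.sub(r"[^a-z0-9]+", " ", raw).strip()


def aggregate(citations: Iterable[Citation]) -> Dict[str, Dict[str, int]]:
    """Count citations per work, keeping preprint citations in their own tally."""
    counts: Dict[str, Dict[str, int]] = defaultdict(
        lambda: {"postprint": 0, "preprint": 0}
    )
    for c in citations:
        counts[work_key(c)]["postprint" if c.refereed else "preprint"] += 1
    return dict(counts)


# Made-up sample data: three citations of one work, one of them to the preprint.
sample = [
    Citation("Doe", "An Example Title", True),
    Citation("DOE", "An example title.", False),
    Citation("Doe", "An Example Title", True),
]
result = aggregate(sample)
print(result)
```

A real aggregator would need far more robust matching (author name
variants, retitled drafts), but the separation of the two tallies is the
part that matters for scholarly purposes.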

(Downloads, however, are definitely good to aggregate.)

Stevan Harnad

> Does this clarify matters?
>
> jc
>
> On Friday 25 April 2008 at 03:20 +0100, Stevan Harnad wrote:
>
> > On Wed, 23 Apr 2008, Jean-Claude Guédon wrote:
> >
> > > In many disciplines, citability requires going to the page level. If
> > > the deposited article in an IR is not paginated in the same fashion as
> > > in the journal, it is no longer citable as a journal article and one
> > > has to go back to the journal to cite the passage correctly, down to
> > > the page number.
> > >
> > > My suggestion is that the IR simply declares that the article
> > > deposited is conformant to the published version and, as such, citable
> > > as is.
> > >
> > > In other words, the version of the article would be as good a
> > > reference as the peer-reviewed version of the article.
> >
> > My perplexity is genuine:
> >
> > If I cannot afford access to the toll-access version of a published
> > journal article, but I do have access to a self-archived Open Access
> > version of it, lacking page numbers, I understand how it might be
> > useful to have a reliable version-comparer confirm that the two texts
> > are substantially the same -- as http://valrec.eprints.org/ does -- and
> > I said so in my original comment below: Authentication (institutional or
> > otherwise) of the self-archived draft is welcome and useful (but not a
> > priority: the drafts themselves, mostly still not self-archived today,
> > are the priority).
> >
> > But how on earth does the version-authentication of the self-archived
> > draft of a published, peer-reviewed journal article take care of the
> > page-reference problem (and is it really a problem?)?
> >
> > If the problem is finding the page-span for the journal reference, and
> > the self-archived draft lacks it, one can of course always find it in a
> > bibliographic database (or one can let the copy editor of the journal in
> > which one is publishing one's own article find it).
> >
> > If it is to find the pages on which quoted or noted passages appear, I
> > would say section headings plus paragraph numbers pinpoint them just as
> > well, if not better, in the PostGutenberg era.
> >
> > If an editor is pedantic enough not to be prepared to settle for section
> > headings plus paragraph numbers to specify cited passages in the
> > original published journal article, it is highly unlikely that he will
> > want to settle instead for section headings plus paragraph numbers to
> > locate the same passage in the supplementary version of that article,
> > self-archived in the author's IR, in order to make it OA for those who
> > cannot afford access to the published version (whether or not that
> > supplementary version has been institutionally verified as a bona fide
> > doppelganger of the original published article -- in all but the page
> > numbers)!
> >
> > (I won't even consider the even more baroque variant of generating a
> > paginated PDF of the self-archived supplement, merely in order to
> > satisfy the residual Gutenberg compulsion to have page numbers at all
> > costs, even when it puts them in competition with the official published
> > version!)
> >
> > Stevan Harnad
> >
> > On Wed, 23 Apr 2008, Jean-Claude Guédon wrote:
> >
> > > I think Jean-Claude has perfectly understood the question and it is
> > > one that was debated some time ago.
> > >
> > > In many disciplines, citability requires going to the page level. If
> > > the deposited article in an IR is not paginated in the same fashion as
> > > in the journal, it is no longer citable as a journal article and one
> > > has to go back to the journal to cite the passage correctly, down to
> > > the page number.
> > >
> > > My suggestion is that the IR simply declares that the article
> > > deposited is conformant to the published version and, as such, citable
> > > as is.
> > >
> > > In other words, the version of the article would be as good a
> > > reference as the peer-reviewed version of the article.
> > >
> > > As for branding issues, I do not remember raising them in the message
> > > mentioned here.
> > >
> > > Best,
> > >
> > > jcg
> > >
> > > On Wednesday 23 April 2008 at 14:56 +0100, Stevan Harnad wrote:
> > >
> > > > I think Jean-Claude may have misunderstood the question at issue
> > > > here:
> > > >
> > > > It concerns the depositing of peer-reviewed, published articles in
> > > > the author's Institutional Repository so that they can be accessed
> > > > by all would-be users, not just those who can afford access to the
> > > > journal in which they were published.
> > > >
> > > > The specific question was about how to provide the link to the
> > > > publisher's official version, if authors wish to provide it (for
> > > > scholarly purposes, as they should!), or because a Green publisher
> > > > has requested that it be provided, in exchange for their blessing on
> > > > immediate OA self-archiving.
> > > >
> > > > There is no issue of citability: The published article is perfectly
> > > > citable, as always. Nor is there any issue of institutional
> > > > "branding": the branding is done by the peer-reviewed journal and its
> > > > track-record for quality. The institution merely provides access to
> > > > the final refereed draft:
> > > >
> > > > On Tue, 22 Apr 2008, Jean-Claude Guédon wrote:
> > > >
> > > > > One important suggestion in this regard is to make the stored
> > > > > article citable.
> > > >
> > > > The stored article is (the author's final refereed draft
> > > > ["postprint"] of) a published, peer-reviewed journal article.
> > > >
> > > > Journal articles are already citable (author, date, title,
> > > > journal name, volume, issue, pages, etc.).
> > > >
> > > > In addition, it is a good idea to have a link, in the citation
> > > > itself, to an openly accessible version of the published article
> > > > (not just the publisher's toll-access version).
> > > >
> > > > That is what depositing the postprint in the author's Institutional
> > > > Repository is for: To provide free access to the published article.
> > > >
> > > > Not to provide something else, citable in its own right (except of
> > > > course the pre-refereeing preprint, which should only be consulted
> > > > and cited until the refereed postprint becomes available).
> > > >
> > > > > Any academic institution with a good name can provide the
> > > > > check needed to guarantee this status to any stored article.
> > > >
> > > > It is ambiguous whether what Jean-Claude means here is that the
> > > > institution should make sure that what has been deposited by the
> > > > author as a postprint of the journal-published article is indeed the
> > > > final refereed draft of the published journal article. (Such
> > > > institutional authentication is welcome, but it is not, strictly
> > > > speaking, necessary, as what is mostly missing now is the postprints
> > > > themselves, not their authentications.)
> > > >
> > > > Or what Jean-Claude may mean here is an extension of the "branding"
> > > > that has been discussed before -- and that (in my view) conflates
> > > > unpublished papers, unrefereed preprints, and published postprints.
> > > >
> > > > The query below pertained to refereed postprints, OA's target, not
> > > > to unpublished papers in need of an institutional "brand."
> > > >
> > > > Harnad, S. (2005) Fast-Forward on the Green Road to Open Access:
> > > > The Case Against Mixing Up Green and Gold. Ariadne 42. (Japanese
> > > > version) http://eprints.ecs.soton.ac.uk/10675/
> > > >
> > > > > From that point on, the link to the publisher, even if needed,
> > > > > loses importance because the open nature of the article will steer
> > > > > users in its direction.
> > > >
> > > > The link to the publisher of a published article loses its
> > > > importance? I agree it is not important for access, given that the
> > > > OA version is accessible and the user cannot afford the toll-access
> > > > version. But surely the publisher link is useful for the scholarly
> > > > record -- and in case anyone may wish to compare the versions. (Not
> > > > to mention that some publishers "require" it as a condition for
> > > > self-archiving the postprint.)
> > > >
> > > > > Of course, some persistent access means will also be needed.
> > > >
> > > > IR's provide persistent access; so do publishers. What's still
> > > > missing today is 85% of the postprints to which persistent access
> > > > can then be provided! (That's what the mandates are for.) Meanwhile,
> > > > there is no harm in accommodating publishers' minor conditions on
> > > > endorsing Green OA self-archiving -- especially if it also serves a
> > > > useful scholarly purpose.
> > > >
> > > > Stevan Harnad
> > > >
> > > > > On Tuesday 22 April 2008 at 10:36 -0400, Stevan Harnad wrote:
> > > > > > On 22-Apr-08, at 10:12 AM, dspace-general-request_at_mit.edu wrote:
> > > > > >
> > > > > > > Date: Mon, 21 Apr 2008 14:24:29 -0700
> > > > > > > From: "Jeremy C. Shellhase" <jcs -- lib-mail.humboldt.edu>
> > > > > > >
> > > > > > > We're working to include more of our faculty's published works
> > > > > > > in our instance of dspace, Humboldt Digital Scholar, and wanted
> > > > > > > to pose a couple questions about "best practices" in complying
> > > > > > > with some of the RoMEO green publishers requirements, before we
> > > > > > > got too far along in the work.
> > > > > >
> > > > > > SHERPA RoMEO "Green" is not quite the right category, because it
> > > > > > means "BOTH postprint-Green AND preprint-Green", whereas what you
> > > > > > should be covering is postprint-Green, whether or not the
> > > > > > publisher also happens to be preprint-Green. You should also look
> > > > > > carefully at the preprint Greens, because many of them mean
> > > > > > "postprint" (author's final refereed draft) even though they say
> > > > > > "preprint" (unrefereed draft), wrongly thinking that "postprint"
> > > > > > means publisher's PDF!
> > > > > >
> > > > > > > Publishers frequently ask for a link back to their online
> > > > > > > presence with statements like:
> > > > > > > * Must link to publisher version
> > > > > > > * Must link to publisher version or journal home page
> > > > > > > * Must link to APA journal home page
> > > > > > > We've looked in the metadata fields available and cannot
> > > > > > > really find a perfect place for this information and link. Has
> > > > > > > anyone set a standard practice for this using metadata?
> > > > > >
> > > > > > There should be an "other locations" field in DSpace, as there
> > > > > > is in EPrints. (If not, someone should quickly create/configure
> > > > > > one.)
> > > > > >
> > > > > > That's the place to put the link to the publisher's version.
> > > > > >
> > > > > > > The other option is to include this information as a
> > > > > > > preliminary page added to the actual submission, embedding the
> > > > > > > information in the digital object itself. If there are any
> > > > > > > other great ideas floating around, we'd sure like to hear.
> > > > > >
> > > > > > Yes, that's an option, and not bad as scholarly practice. But
> > > > > > since it entails more work for the author, and since it's already
> > > > > > like pulling teeth to get them to deposit, it's probably more
> > > > > > efficient to use the "other locations" field in the IR interface.
> > > > > >
> > > > > > > Publishers frequently state that "Publisher version cannot be
> > > > > > > used", allowing only the author's pre or post refereeing
> > > > > > > drafts. Well, as it turns out, many of the faculty that have
> > > > > > > time to consider archiving their legacy are emeritus or close
> > > > > > > to it and the publications they're interested in archiving no
> > > > > > > longer have a digital author's copy available. We're stuck with
> > > > > > > how to proceed, if indeed we can. Does scanning and OCRing a
> > > > > > > printed copy of an article satisfy this requirement?
> > > > > >
> > > > > > I agree completely with the previous reply by Shane Beers below:
> > > > > > Just "repurpose" the PDF or scanned OCR.
> > > > > >
> > > > > > Stevan Harnad
> > > > > >
> > > > > > > From: Shane Beers <sbeers -- gmu.edu>
> > > > > > >
> > > > > > > > I've discussed this in past dspace threads, but I'll mention
> > > > > > > > it again here. I frequently use software called ABBYY
> > > > > > > > FineReader Pro
> > > > > > > > (http://www.abbyy.com/finereader8/?param=44890), which allows
> > > > > > > > one to import an existing PDF and re-purpose the content.
> > > > > > > > I've been thinking about writing up a guide to using ABBYY to
> > > > > > > > do this, but it's not difficult to figure out, in my opinion.
> > > > > > > > Essentially you take the content and de-select things like
> > > > > > > > page headers/footers/etc and create a new PDF that uses the
> > > > > > > > same textual content, but does not contain any publisher
> > > > > > > > information. This successfully side-steps that issue, in my
> > > > > > > > not-a-lawyer point of view.
> > > > >
> > > > > Jean-Claude Guédon
> > > > > Université de Montréal
> > > > >
> > > > >
> > >
> > > Jean-Claude Guédon
> > > Université de Montréal
> > >
> > >
>
> Jean-Claude Guédon
> Université de Montréal
>
Received on Fri Apr 25 2008 - 15:31:59 BST
