Re: Central versus institutional self-archiving

From: Stevan Harnad
Date: Sun, 8 Aug 2004 23:04:08 +0100

Prefatory note: I strongly support the House/NIH proposal to mandate
self-archiving of NIH-funded research, but I think it is important to
amend it so that it gets the details right. It now has to go to the Senate,
and it needs more thought to make it viable and optimal.

    "Re: Mandating OA around the corner?"

    "AAU misinterprets House Appropriations Committee Recommendation"

For what it's worth, I support the Sabo Bill too, in principle, but that
too needs work to make it viable, and I and many others have suggested
what needs to be done to make it work.

    "Public Access to Science Act (Sabo Bill, H.R. 2613)"

It would not have been a service to OA to support either of these
Bills unreservedly and verbatim: Bills are written to be improved
and optimized through informed feedback. The two Bills (Sabo and
House/NIH) should probably be combined into one now, if possible,
simply replacing Sabo's "public domain" with "self-archived" and
replacing the House's "self-archived in PubMed Central OA Archive"
with "self-archived in the Author's Institutional OA Archive". PubMed
Central would then be recommended to harvest the metadata from all those
distributed self-archived biomedical papers using the OAI protocol, and
to provide a backup locus for archiving the full-text too if the author
has no institutional OAI archive to deposit it in yet. This will help
pressure the remaining 16% of gray journals to go green and institutions
to create OAI archives.

On Sun, 8 Aug 2004, Heather Morrison wrote:

> One of Richard Durbin's points which I think is particularly important
> and bears repeating, is that Pubmed (Medline) is a superior search
> tool. Although, in my opinion, OAIster is an excellent search tool,
> and distributed archiving a needed approach, when it comes to searching,
> no general tool can match a searching & indexing tool that is developed
> to meet the particular needs of a discipline. A search that begins
> with Pubmed and leads the individual to the fulltext provided through
> OA - regardless of where the article is archived - is the best means of
> connecting the user as directly as possible with exactly the information
> they need, in the medical arena.

But that is exactly my point too! Separate the question of the
indexer/search engine (PMC is definitely superior) from the question of the
*locus* of the self-archived full-text (i.e. *where* it is archived)! That
is exactly what the OAI harvesting/interoperability protocol is about
and for. The present text of the House Committee recommendation is
needlessly and counterproductively mandating something over and above
what is needed to make all the self-archived NIH research searchable
via PMC! It is mandating that the full-text must be self-archived *in
PMC* -- whereas PMC could just as easily merely harvest the metadata
from whatever OAI-compliant Archive the full-text is actually in,
thereby allowing the full benefits of the congressional mandate to
propagate across institutions and disciplines rather than needlessly
restricting them to the special case of PMC-archiving and NIH research.
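
To make the harvesting point concrete, here is a minimal sketch of what a
central indexer would do under the OAI protocol (OAI-PMH 2.0): request the
Dublin Core metadata from an institutional archive and record where the
full-text lives, without holding the full-text itself. The archive address
(eprints.example.edu), the sample record and the function names are
hypothetical, made up for illustration; they are not PMC's actual harvesting
setup, and the treatment of any http dc:identifier as a full-text link is an
assumption about how a given archive fills in its metadata.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request for an OAI-compliant archive."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Extract the OAI identifier, title and full-text links from a response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        oai_id = rec.findtext("oai:header/oai:identifier",
                              default="", namespaces=NS)
        title = rec.findtext(".//dc:title", default="", namespaces=NS)
        # Assumption: the archive exposes the full-text location as an
        # http(s) dc:identifier; the indexer stores only this pointer.
        links = [e.text for e in rec.iterfind(".//dc:identifier", NS)
                 if e.text and e.text.startswith("http")]
        records.append({"oai_id": oai_id, "title": title,
                        "fulltext_urls": links})
    return records


# Hypothetical response from a hypothetical institutional archive.
SAMPLE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:eprints.example.edu:123</identifier>
        <datestamp>2004-08-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc
            xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
            xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An NIH-funded study</dc:title>
          <dc:identifier>http://eprints.example.edu/123/</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""

if __name__ == "__main__":
    print(list_records_url("http://eprints.example.edu/oai"))
    for record in parse_records(SAMPLE):
        print(record["title"], record["fulltext_urls"])
```

The same two steps apply to any OAI-compliant archive, which is why the
searchable index and the locus of the full-text can be kept separate.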

Moreover, once the US and UK mandates do their work, and the OA content is
at last out there, I assure you that far more indexing/search-engine wonders
will spawn over it than any that PMC has yet dreamt of!

> If the articles are housed in a central server, then ideally they would
> also be able to be searched via OAIster as well - that way, users
> who are looking for other kinds of information besides the strictly
> medicine-based, will find what they need as well.

That's a foregone conclusion, and part of what I too said in my posting:
PMC is already one of the archives indexed by OAIster. But that is
not a reason for restricting the Congress's self-archiving mandate to
self-archiving in PMC!

> May I also suggest that central vs. distributed archiving, with OA,
> is not an either-or proposition? An OA article housed at Pubmed can be
> easily included in an institutional archive and the author's own website
> as well. Given that the most basic of technology issues regarding the
> archiving and preservation of material in electronic format have yet to
> be worked out, the safest approach, and the one I would recommend, is
> all of the above (central plus institutional plus author's own website).
> This would fit with the LOCKSS (lots of copies keep stuff safe) principle.

I agree on redundancy, but this misses the logic of the amendment I am
recommending: Explicitly mandating central/PMC self-archiving of course
does not *prevent* authors also self-archiving those papers
institutionally (which would bring all the further benefits, in
generalizing across institutions and disciplines that I mentioned). But
explicitly mandating central/PMC self-archiving also does not mandate or
even make it likely that authors will also self-archive institutionally.

Institutional self-archiving is the more powerful and general
strategy. Institutional self-archiving can be mandated and
PMC harvesting can be arranged automatically, but not vice
versa! Redundant archiving can also be recommended (but it too can be
set up automatically). Mandating central is the tail wagging the dog
(or rather, cutting off the tail and throwing away the dog!).

> It also seems to me that there is no reason why there needs to be only one
> approach to OA. One approach might be more suitable for one discipline
> or sub-discipline than another. For example, if there is any group where
> the tendency to publish is relatively small because that particular
> discipline does not place quite the same emphasis on publish-or-perish as
> other disciplines, then perhaps publishing could have lower costs due to
> lower submission rates leading to lower rejection rates. Physics seems
> to be doing well with preprints, whereas in other areas it might be more
> important to ensure that readers looked at the corrected postprint.

This seems to be mixing together the access problem and the pricing
problem: Researchers need OA to maximize their access and impact
even if their journals are sold *at cost*. One size does fit all --
all 2.5 million articles in all 24,000 peer-reviewed journals, across
all disciplines, price-ranges, and publication-pressures -- and that
size is: OA. And the way to provide that OA quickly and surely is
to self-archive, now. And the way to make the effect of mandating
self-archiving generalize the fastest and furthest beyond the specific
corpus in question is to mandate institutional self-archiving (or just
self-archiving), *not* to mandate self-archiving in a specific central
archive. There is no need for that and no advantage in it: all the same
advantages can be had via institutional self-archiving, along with,
critically, further advantages as well. The converse does not hold.

So it's worth getting this Senate Bill right, rather than rushing into
a needlessly restrictive version that misses the chance to maximize the
propagation of OA beyond the bounds of just the NIH-funded corpus, because
of basic misconceptions about the nature of distributed digital information
and interoperability.

> As long as the results are OA, the details of where and how things
> are published don't really matter, do they? Therefore, I would second
> Richard's suggestion that those who advocate for OA should be unanimous
> in our support for the NIH proposal.

And I would say (for much the same reason -- which is that for OA it does
*not* matter where a given full-text is) that we should amend the NIH
proposal so as to get it right, with nothing to lose and a lot to gain!

Stevan Harnad
Received on Sun Aug 08 2004 - 23:04:08 BST
