Re: DASER 2 IR Meeting and NIH Public Access Policy

From: Stevan Harnad
Date: Tue, 6 Dec 2005 16:25:14 +0000

On Tue, 6 Dec 2005, David Stern wrote:

> If you had stayed through the final presentation you would have heard
> one other suggestion that directly addresses your highest priority:
> immediately increasing the percentage of OA material in the
> repositories.
> My suggestion was to place OA materials immediately in centralized
> repositories rather than waiting for each researcher organization to
> mount its own Institutional Repository (IR).

David, the option of depositing in an existing OAI archive if/when
the author has no other immediate place to deposit has been treated
repeatedly in this Forum. And of course the commonsense strategy, highly
recommended, is not to wait for an IR if there is not one available,
but to deposit immediately in an existing OAI archive (and then later
encourage your institution to create one). I have been offering offshore
archiving in CogPrints to authors with no IR for years, for example,
and Peter Suber has been negotiating with Brewster Kahle to make the
Internet Archive OAI-compliant and available for overflow archiving for
unaffiliated authors.

That all goes without saying; but lack of archives is not the problem. Lack of
archiving is the problem. Its main symptom is a plenitude of IRs with a
penury of content.

So whereas your suggestion -- that authors who are desperate to
self-archive and have no IR to do it in today should do it in a central
archive -- is welcome, it does not address the real problem today, which
is that authors are not spontaneously desperate to self-archive. And
it is institutional self-archiving mandates that will induce authors
to do it; authors' own institutions are in a position to monitor and
reward compliance; and authors' institutional IRs are also the natural
place for research-funders to mandate that their fundees self-archive
(though funders too could offer a central OAI archive to accommodate
authors whose institutions don't yet have an IR: it can all be harvested
back in due course).

> arXiv was a success because it had an immediate critical threshold of
> materials in a discipline. This would not have happened if we had
> waited for the majority of authors to have IRs.

Arxiv was a success because the physicists were already desperate to share
their articles, and were already doing it in paper. Arxiv happened to start
with a central archive. This was before the days of OAI interoperability. If
OAI had already been there, we would have offered Eprints as a free
OAI-compliant archive-making software then, and OA IRs would have started in
1991. As it is, computer scientists, who have been self-archiving even longer
than physicists, and have self-archived a larger total number of articles,
happen to have done it on their own websites (and before that, their ftp
sites), and all of that was then harvested into a "central" virtual archive,

That's all history today. Institutional archives and central archives are
equivalent and interoperable, and they are not the problem: Getting authors
to self-archive (for their own good, and the good of their institutions,
funders, and research itself) is the problem. Central archiving is certainly
not the solution; in fact, in the OAI-interoperable harvesting age, central
archiving is the more primitive option. (Librarians tend to be more comfortable
with collections located in one physical space, but that is not the way it is
done in the digital era.)

That's what I would have said at DASER if my plane's departure time had not
prevented me from attending your talk...

> Many important research
> organizations still do not have IRs, and will not have fully functional
> ones for some time for many reasons which must be accepted as reality.

They can and will, if they wish; and there are no substantive reasons,
functional or otherwise, for not having them (assuming the organizations
have research output at all, that they are publishing, and hence should
be self-archiving: if not, nolo contendere).

And in the meanwhile, as noted, the existing central archives can take up
any immediate slack. But immediate slack is not the problem: the slackness
of 85% of researchers in doing the optimal and inevitable is. And mandates,
not central archives, are the solution.

> Yes, we can harvest the information centrally for those with IRs, but we
> can quickly increase the possibility of mass contributions through
> providing and emphasizing shared repositories for those without IRs.

David, you are dreaming if you think the non-archiving of 85% of the OA target
corpus today is due to the lack of an archive to archive it in: It is due in
(smaller) part to insufficient knowledge of the usage/impact benefits of
self-archiving and in (larger) part to insufficient inclination to self-archive
spontaneously (because of competing priorities, unfounded worries that it's
time-consuming, unfounded worries that it's illegal, etc. etc.).

A clear-cut institutional and research-funder mandate to self-archive
(preferably in the author's own IR) is the solution, as described in my
proposal for fixing the flawed, failed NIH public access policy.

> We really don't need to do anything technical, as arXiv could
> immediately add additional discipline archives. We only need to
> redirect authors to existing infrastructures.

You sound like me, 8 years ago: We've been there, done that, and gotten
nowhere. The problem isn't a pressing tide of would-be self-archivers with no
place to self-archive. The problem is to press the tide, generate the flow...

> Might this be a proactive and significant change in policy resulting in
> immediate positive impact?

In a word: No. It is a symptom of being too far removed from the action to have
gotten a clear sense of what is and is not happening, and why, and hence what
needs to be done to get it happening.

Cheers, Stevan

> At 06:22 PM 12/5/2005, Stevan Harnad wrote:
> This is a summary (from my own viewpoint) of the Washington meeting this
> weekend sponsored by American Society for Information Science &
> Technology (ASIST), organized by Michael Leach (Harvard, President,
> ASIS): Digital Archives for Science and Engineering Resources (DASER 2)
> (For some other slants on DASER 2, see these two blogs; but beware, as
> they do contain some notable garbles and omissions, having been blogged
> in real time: Dorothea Salo and Christina Pikas.)
> DASER 2 rehearsed some familiar developments, highlighted some of them,
> and brought out one potentially important new one (re. the NIH Public
> Access Policy).
> The familiar developments were the worldwide growth in institutional
> repositories (IRs), and in new services to help institutions to create,
> maintain or even host IRs: ProQuest (using Bepress software), BioMed
> Central (using Dspace software) and Eprints Services (using Eprints
> software).
> Fedora software was also discussed, but it was quite apparent (at least
> to me!) that at this DASER meeting, whose specific focus was digital
> science/engineering resources -- hence Open Access (OA) IRs in
> particular, targeting the self-archiving of institutional peer-reviewed
> science/engineering article output, in order to maximise its visibility,
> usage and impact, rather than digital curation in general -- Fedora's
> much wider and more diffuse target (the collection and curation of any
> and all institutional digital content, incoming or outgoing, research or
> otherwise) was not the urgent priority. Indeed, there are good reasons
> for expecting that if the IR movement first puts its full weight and
> energy behind the focussed archiving of 100% of each institution's own
> OA IR target content, that will itself prove to be the most effective
> way to launch and advance the more general digital-curation agenda too.
> There was likewise considerable time devoted to the future of
> publishing, with much discussion of OA publishing and the possibility of
> an eventual transition to OA publishing. But here too, the lesson was
> that the best contribution that OA IRs in particular can make to this
> possible/eventual transition is to hasten their own transition to the
> institutional self-archiving of 100% of their own OA target content.
> Present and contributing very constructively were the two Learned
> Society Publishers in whose discipline author self-archiving has been
> going on the longest, and has gone the farthest (having reached 100%
> years ago in some fields): The American Physical Society (the first
> publisher to adopt [in 1994] an explicit "green" policy on author
> self-archiving [today about 76% of publishers and 93% of journals are
> green]) and the Institute of Physics (likewise green, along with some
> notable experiments in "gold" OA publishing).
> The keynote speaker was Jan Velterop, formerly publisher of "pure gold"
> BioMed Central, and now director of OA for Springer's "optional gold"
> Open Choice. Jan's main concern was (understandably) to encourage
> authors to pick the gold option and to encourage their institutions and
> research councils to fund the author costs.
> Jan applauded the growth in the IR movement but noted a substantial
> decrease in the number of postings on the American Scientist Open Access
> Forum (AmSci) in 2004-2005 compared to prior years, and worried that
> this might reflect a decrease in OA momentum.
> On the contrary: the decreased AmSci volume was intentional. In 2004, a
> new policy for AmSci postings was announced, reserving the Forum for
> concrete, practical discussion of institutional and research-funder OA
> policy design and implementation. AmSci's former open-ended (and
> unending) philosophical and ideological debate about open access was
> instead redirected to the many other OA lists that have spawned since
> the AmSci OA Forum's inception in 1998:
>     "[T]his Forum, the first of what is now a half dozen lists
>     devoted to OA matters, is -- as has been announced several
>     times -- now reserved for the discussion of concrete,
>     practical means of accelerating OA growth." [December 2004]
> The DASER conference also devoted time and thought to the future of
> librarians in the digital and OA era; again, insofar as IRs are
> concerned, a good investment of librarians' available time, energy and
> resources is in helping to create and fill IRs, first OA IRs, and then
> eventually expanding them to wider and wider digital content, thereby
> again facilitating the inevitable and desirable transition. (My own
> personal view, however, is that librarians should abstain from
> speculation about the future of peer review, which is not really their
> field of expertise; I also think retraining librarians to become
> institutional in-house publishers may not be the best use of their time
> and talents.)
> That librarians can be an enormous help in getting institutional authors
> to deposit their OA content in their IRs was illustrated in my own talk,
> using examples from around the world (CERN, Portugal, Southampton) but
> with especially striking data from Australia (with thanks to Arthur Sale
> and Paula Callan). I also reported on the growing evidence for the
> dramatic OA research impact advantage across all disciplines, now
> including the humanities and social sciences, and its implications for
> research and researcher funding and progress.
> The OA impact advantage, IRs, and librarian-help are all *necessary*
> conditions for filling IRs with OA content, but to make them into a
> jointly *sufficient* condition, one further critical component is
> needed, and this has been demonstrated in case after case: The only IRs
> that are well along the road toward 100% OA are the ones that also have
> an institutional self-archiving requirement. Without that, spontaneous
> OA self-archiving is hovering at about 5% - 15% globally.
> Which brings us to the last and newest development reported at DASER:
> The NIH public access policy is flawed and failing -- its deposit rate
> is at about 2%, which is even *below* the global average for spontaneous
> self-archiving. But the good news is that NIH has realized this, and is
> planning to do something about it. The question is: what? There is a
> committee to look at this question, but at a quick glance, it does not
> seem to include those who actually know what needs to be done, and how,
> to make the NIH policy work. Represented are librarians and publishers,
> but missing are the institutional OA policy-makers that can make
> self-archiving work.
> But the solution is simple, and NIH can do it, very easily. First, it is
> important to face the 3 flaws of the current NIH policy very
> forthrightly. Here they are, in order of severity:
> (1) Deposit is *requested* rather than *required*.
> (2) The request is not for immediate deposit but deposit within one
>     year of publication.
> (3) The request is for deposit in PubMed Central (PMC) (rather than in
>     the author's IR, from which PMC could harvest it).
> The reason the deposit is not required and not immediate is related to
> the reason the deposit is in PMC instead of the author's own IR: NIH has
> cast itself in the role of a 3rd-party access-provider. This is fine,
> for its own funded research. But then it must deal with its publishers
> and their conditions (which include access-embargoes of up to 12 months,
> in order to protect against perceived risks to their revenues).
> OA itself does not require a 3rd-party access-provider. All it requires
> is OA! And for that, any OAI-compliant archive, whether the author's own
> institutional IR or a central repository like PMC, will do, because they
> are all equivalent and interoperable, in the OAI-compliant age, and all
> accessible to any user or harvester webwide.
> So NIH can have what it wants -- 100% of its funded content in PMC
> within a year of publication -- while still requiring deposit
> immediately upon acceptance (preferably in the author's IR, harvestable
> by PMC, but absent that, directly deposited in PMC).
> That leaves only the question of how to set the access-privileges, and
> now those can be merely the subject of a (strong) request to set them to
> OA immediately upon deposit -- but with the option left open (sic) for
> the author to set access instead as restricted to institution-internal
> and PMC-harvestable (or, for PMC, PMC-administrative-only) if the author
> has reason to prefer that (the reason presumably being that the article
> is published in one of the 7% of journals that are not yet "green" on
> immediate OA self-archiving).
> Is this merely a way of tweaking the current NIH policy so as to get
> deposits up to 100% without getting immediate OA up to 100%? The answer
> is: Yes and No. Yes, this policy will immediately drive up NIH deposits
> from their current 2% level to 100%, because deposit will be a
> fulfilment condition on receiving the NIH grant. But no, it is not true
> that it will not generate immediate 100% OA. For it can generate that
> too, with a far smaller delay-loop than 12 months: something more of the
> order of 12 hours at most:
> The solution is very simple (and we are already building it into the
> Eprints IR software): The metadata (author, title, journal, date,
> abstract) are of course all immediately OA for 100% of deposited papers,
> regardless of how the access-privileges for the full-text are set. That
> means that from the moment the text is deposited, the metadata are
> visible and accessible to all would-be users webwide, thanks to OAI and
> the OAI search engines, as well as to Google Scholar and the non-OAI
> search engines.
> But what about the full-text? For about 7% of journal articles (the ones
> in the non-green journals), access will not be immediately set to OA.
> What the Eprints software will do when a would-be user encounters this
> dead-end is that the IR interface will provide a link that will pop up a
> window allowing the user to send an automatic email to the author (whose
> email address is part of the IR's internal metadata) requesting to be
> emailed an eprint of the full-text in question. The requester's email
> will be sent by the software -- automatically and immediately -- to the
> author, with a prepared URL that the author need merely click on, in
> order to have the eprint immediately emailed to the would-be user.
> This author-mediated access-provision is not quite as convenient,
> instantaneous or sensible as immediately setting the full-text to
> unmediated OA, so the user can just click to down-load it, but it is
> effective 100% OA just the same. And NIH can (as now) harvest the
> full-text whenever it likes, and can go on to make it OA in PMC whenever
> it elects to. None of that will be holding back OA any longer.
> This immediate-deposit requirement is also the form that the RCUK policy
> is now taking; and it offers a general model for the rest of the world
> to adopt too.
> Note that this slightly modified policy completely side-lines all
> publisher objections: It is merely a deposit requirement, not an OA
> access-setting requirement. It is left up to researchers and the
> would-be users of their research to sort out access-provision according
> to the needs of research -- exactly as it should be.
> This is of course also the policy that institutions should adopt, for
> their own institutional research output, whether or not funded by NIH or
> RCUK. An immediate-deposit requirement will result in IRs worldwide
> filling virtually overnight (at long last).
> (The other thing NIH should do is to couple its deposit requirement with
> an explicit statement of NIH's readiness to cover OA journal publication
> charges for those NIH fundees who choose to publish their findings in an
> OA journal.)
> Stevan Harnad
> David Stern
> Director of Science Libraries and Information Services
> Kline Science Library
> 219 Prospect Street
> P.O. Box 208111
> New Haven, CT 06520-8111
> phone: 203 432-3447
> fax: 203 432-3441
Received on Tue Dec 06 2005 - 16:38:59 GMT
