Re: Repositories: Institutional or Central?

From: Tomasz Neugebauer <Tomasz.Neugebauer_at_CONCORDIA.CA>
Date: Tue, 10 Feb 2009 14:06:18 -0500

 

On Mon, Feb 9, 2009 at 3:56 PM, Tomasz Neugebauer wrote:

       

      A granting agency can make open access to the results of
      the research a condition of funding, but a university
      mandate that makes the university IR the compulsory locus
      of deposit... is not a good idea.  An appeal to
      individualism of the researchers should be sufficient...

 

Stevan Harnad: "IRs have been going up for nearly 10 years now;
there are over 1000 of them, and the unmandated ones all remain
near-empty (< 15%), whereas the (very few) mandated ones approach
100% annual deposit within about two years: How much longer do you
propose we go on waiting for "individualism" (unmandated deposit) to
start proving to be sufficient?"

 

 

Response:  I think it is a mistake to equate "individualism" with
unmandated deposit - that was not my argument.  My point was about
how to achieve mandates: I think it is imperative that the
researchers/faculty understand the need for a mandate and indeed,
help to formulate it themselves.  Of course this implies that if they
decide against adopting such a mandate, then their decision has to be
respected.  This is in contrast to arguing for a mandate from the
principle of a collective vision that is somehow universally true. 
The argument that individualistic deposit rates are insufficient in
IRs can be extended to the argument that adoption rates of mandates
have been insufficient.  Why aren't institutions adopting deposit
mandates?  Could it be, in part, because of some mistake in the
strategy for convincing them to adopt one?  That was the point of that
part of my argument.  How do you explain the slow pace of adoption of
mandates?  

 

Stevan Harnad: "Meanwhile, those of us who have already been through
this many times will continue to advocate the solution for which
there is already the evidence that it works."

 

Response: There is evidence that a mandate increases submission
rates, but there is also evidence that
universities are not adopting mandates.  I think that the message
"faculty are not free agents and have to comply with a global vision"
is part of the problem.

 

 

Stevan Harnad: "High Energy Physicists, who have been spontaneously
self-archiving at close to 100% since 1991, happen to deposit in a
CR, Arxiv, which today has a total of 518,884 documents (including
other subfields of physics, plus a few other subdisciplines as well).
(All figures are from ROAR.)

 

Computer Scientists, who have been self-archiving even longer, mostly
deposit on their own distributed institutional websites, and these
deposits are then harvested by their field's CR, Citeseerx -- but
that CR too has a healthy total for its field of 716,772 documents.
So does Repec, with 774,432 documents, which is likewise a harvested
rather than a direct-deposit CR.

 

There are two lessons to be learned from these data. First, even in
CRs with a strong normalized deposit rate (as in computer science,
economics and physics), their success has nothing to do with locus of
deposit, since two of these three big ones are harvested from
distributed institutional websites. That makes them a lot more like
google or google scholar than like repositories, since no one
deposits directly in google and google scholar. Second, the success
of these CRs has everything to do with discipline-specific practices,
because all three of these disciplines have long had the practice of
sharing pre-refereeing drafts (preprints) before publication."

 

 

 

Response: The success of arXiv in terms of usability to researchers
and submission rates, combined with the fact that arXiv is a central
deposit repository, is evidence in favor of subject-based CRs; it is
also evidence against the claim that "their success has
nothing to do with locus of deposit".  Physicists deposit to arXiv
directly.  Citeseerx and Repec are interesting attempts at
interoperability with other repositories, but many documents that are
contained there are not available in any IR and are harvested from
individual faculty pages, for example.  My point here is that the
relationship between IRs and subject repositories is still not well
defined.  Who will be submitting to the subject repositories?  Will
this happen automatically according to an IR policy, according to the
subject repository policy, or will the researcher continue to do this
independently?

 

 

Stevan Harnad: "This brings us to the biggest CR of all, PubMed
Central, with 1,525,967 documents: Most of these are not deposited by
their authors at all (only the annual 80,000 mandated by NIH are);
the rest are the result of various arrangements with the publisher,
after various embargo periods of up to a year or more have elapsed.

 

We have now surveyed the top 4 CRs, to discover that there is in fact
no lesson at all to be learnt there by IRs, on how to overcome the
15% spontaneous-deposit baseline. It has nothing to do with local vs
central deposit, nor with the functionality of CRs."

 

 

 

Response:  Given that one of those 4 CRs is Arxiv, where "High Energy
Physicists, who have been spontaneously self-archiving at close to
100% since 1991, happen to deposit in a CR", it is difficult to
accept that the CR aspect is irrelevant to its success.  I suppose
your argument is that this (the fact that arXiv is a CR) is pure
coincidence?  You have only looked at the 4 "largest" CRs, but 1 out of
the 4 is contrary to your argument.  I think that in other fields,
like the humanities and social sciences, although the CRs might be
smaller, they are not less important or relevant.

 

 

 

Stevan Harnad: "The CR functionality issue is even more of a red
herring, because of course users will consult the harvested global
service, the CR, not the individual, distributed local sources, the
IRs, for navigation and search, just as they consult google and
google scholar. It would be absurd to implement sophisticated direct
search capability at the single IR level, when the obvious locus for
search is the central harvester level -- and again, that has nothing
whatsoever to do with whether the central service is itself a locus
of direct deposit, like Arxiv, or harvested from distributed local
sites, like citeseerx or google scholar."

 

 

Response:  I understand this argument, but consider a principle from
human-centered design: "the appropriate allocation of function" with
respect to user and technology.  In a CR like arXiv, it is the
researchers themselves, through their individual choices, who decide
to place their article in the repository - working within the context
of where the user will be finding the document.  All of those
individual decisions of researchers to deposit articles in arXiv
create the emergent properties of the collection, and arXiv is also
used as a search tool.  The harvested global service threatens to
replace that distributed human function with a series of
probabilistic algorithms (based on user-defined policies?) that group
the results into sets by subject.  I think that this is a real
distinction that is relevant.   

 

Stevan Harnad: "Tomasz, now that you have voiced your own opinion, it
would be a good idea for you to read the background literature on
this topic. There you will find the large, multidisciplinary and
multinational author surveys that were conducted several years ago by
Alma Swan and Sheridan Brown, in which researchers did indeed voice
their opinion, and their opinion was that they would not deposit
until and unless it was mandated by their institutions and/or
funders, but that if and when deposit was indeed mandated, 95% would
deposit, and over 80% would deposit willingly. This finding has since
been confirmed by others; and Arthur Sale has gone on to do studies
on authors' actual behavior, with and without a mandate, to find that
authors do indeed behave in accordance with the opinion they voiced
in the Swan/Brown surveys, with their actual deposit rate approaching
100% within two years of the adoption of a deposit mandate (but
languishing at the baseline 15% -- or 30% if incentives and
assistance are provided -- if deposit is not mandated)."

 

Response:  I am familiar with some of the background literature on
this topic, but I always appreciate a link to a relevant study, thank
you.  My argument is not against mandates; my argument is about how
to achieve them - and that includes the right of a department or
faculty at a university to refuse to adopt one.  In my opinion, a
statistical study of a thousand authors from all over the world (a
bias towards science respondents seems to be present) is not a
legitimate basis on which to dictate a mandate to a faculty at a
particular institution - it is preferable to convince the faculty at
the particular institution to adopt a mandate.  I think that the
difference is subtle, but nevertheless very important.  If it is true
that the majority of faculty want to support open access with a
mandate, then there shouldn't be any problems in getting universities
to adopt the mandates.  Yet, the problem persists.  There has been
much contemplation about the insufficient submission rates for
self-archiving; what about the causes of insufficient adoption of
mandates?  I think one of the causes is insufficient consideration
for the importance of emergent properties of collections that are
created in a distributed way by the researchers through individual
acts of submission to repositories - I think that these properties
are important to researchers.  Another reason, I think, is an
overemphasis on a collectivist argument strategy in promoting
mandates.  What do you think is the cause? 

 

Thank you for your response, by the way; I appreciate your thoughts
on the subject.

 

Tomasz Neugebauer
Received on Wed Feb 11 2009 - 15:29:44 GMT
