
From: Stevan Harnad <amsciforum_at_GMAIL.COM>
Date: Thu, 5 Feb 2009 16:50:19 -0500

On Thu, Feb 5, 2009 at 12:34 PM, Chanier Thierry
<thierry.chanier_at_univ-fcomte.fr> wrote:

      I agree. The question of tools for central repository
      (CR) is central.

      - it is preferable to avoid opposing CR and
      (Institutional repository) IR.

They are not opposed. Both are welcome and useful. What is under
discussion is locus of deposit. (The deposited document itself, once
deposited, may be exported, imported, harvested to/from as many
repositories as desired. The crucial question is where it is actually
deposited, and especially where deposit mandates from funders
stipulate that it should be deposited.)

The issues for locus-of-deposit are:

      (1) Single or multiple deposit? 

I think everyone would agree that at a time when most authors (85%)
are not yet depositing at all, this is not the time to talk about
depositing the same paper more than once.

      (2) If single deposit: where, institution-internally or
      institution-externally?

The author's institutional repository (IR) might be his university's
IR, or his research institute's IR, or the IR of some subset of his
institution, such as his department's IR or his laboratory's IR. The
point is that the locus of production of all research output --
funded and unfunded, in all disciplines and worldwide -- is the
author's institution. The author's institution also has a shared
stake and interest with its authors in hosting and showcasing their
joint research output.

All other links to the author's research are fragmented: Some of it
will be funded by some funders, some by others, and some will be
unfunded. Some will be in some discipline or subdiscipline, some in
another, some in several. There is much scope for collecting it
together in various combinations into such institution-external
collections, but it makes no sense at all to deposit directly in some
or all of them: One deposit is enough, and the rest can be harvested
automatically. The natural and optimal locus for that one deposit is
at the universal source: the author's own institution.

      (3) Import/Export/Harvest from where to where?

The natural and optimal procedure is: deposit institution-internally
and then, where desired, import/export/harvest
institution-externally. This one-to-many procedure makes sense from
every standpoint: Single convergent deposit, convergent mandates,
maximal flexibility and efficiency, minimal effort and complication
(hence maximal willingness and compliance from authors). The
alternative, of many-to-one importation, or many-to-many
import/export means multiple, divergent deposit, divergent mandates,
reduced flexibility and efficiency, increased effort and
complications (and hence reduced willingness and compliance from
authors).
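
The deposit-once, harvest-many direction can be sketched in a few lines. This is only an illustration, assuming the IR exposes an OAI-PMH interface (the protocol is not named in this thread); the base URL, identifier and sample response below are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Standard OAI-PMH and Dublin Core namespaces.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for one repository."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query

def parse_records(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for rec in root.iterfind(".//oai:record", NS):
        ident = rec.findtext("oai:header/oai:identifier", default="", namespaces=NS)
        title = rec.findtext(".//dc:title", default="", namespaces=NS)
        pairs.append((ident, title))
    return pairs

# Hypothetical response from a hypothetical IR (no network access needed).
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:ir.example.edu:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample article</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

records = parse_records(SAMPLE)
```

Any number of harvesters can run the same request against the same IR, which is why a single institutional deposit suffices for all of them.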


      In some countries, CRs may be prominent (particularly
      because local
      institutions have a low status, so IRs may not mean much
      to researchers ...
      when they exist), because centralized procedures for
      evaluating research
      may offer opportunity to researchers to start depositing
      - see hereafter
      about France -).

Institutional status-level is irrelevant, because research is not
searched at the individual IR level but at the harvester (CR) level.
We are discussing here what is the optimal locus of deposit, so as to
capture (and mandate the capture of) all of OA's target content,
worldwide, and as quickly and efficiently as possible. What matters
for this is to find a procedure for systematically capturing all
research output, and the natural and exhaustive locus for that is at
the source: the institution (university, research institute,
department, laboratory) that hosts the researcher, pays his salary,
and provides his institutional affiliation.

There is of course research evaluation at the institution-internal as
well as the institution-external (funder and national) level. But
even for national research assessment exercises, such as the RAE in
the UK, the institution and department are the "unit of assessment";
they are local, and distributed. And the natural locus for their
research output is their own IRs. And that is exactly how many UK
universities provided their submissions to RAE 2008. See the IRRA.


      - Researchers should be free to choose where they deposit
      but with
      requirements to deposit. They may do it in different
      repositories (I mean
      one document is only in one place, but depending on the
      nature of the
      document / data, one may choose various repositories)

I am afraid that it is here that we reach the gist of the matter (and
the height of the misunderstanding and equivocation):

First, the only kind of deposit under discussion here is OA's primary
target content: refereed journal articles. That is also the only
deposit requirement (mandate) under discussion here, because although
there are many other things an author might choose to deposit too --
books, software, multimedia, courseware, research data -- those are
optional contents insofar as OA deposit mandates are concerned. And
it is specifically the locus of deposit of the required contents
(refereed journal articles) that matters so much, particularly in
funder mandate policies.

So whereas it may seem optimal for a funder to simply require deposit
in some OA repository or other, but to leave it up to the author to
choose which (and such a funder mandate is certainly preferable to a
mandate that specifies deposit in a CR, or to no mandate at all),
this is in fact far from being the optimal mandate, for the reasons
discussed by Prof. Rentier: 

Most researchers (85%) do not deposit unless they are required to.
Funders can only mandate the deposit of the research that they fund.
If they require that it must be deposited in a specific CR, they are
in direct competition with institutional mandates (necessitating
double or divergent deposit). If funder mandates simply leave it open
where authors deposit, then they are not in competition with IR
mandates, but they are not helping them either. As noted,
institutions are the producers of all research output -- funded and
unfunded, in all disciplines, worldwide. Only 30 institutions mandate
deposit so far, worldwide (out of tens of thousands). If a funder
mandates deposit, but is open-ended about locus of deposit, it leaves
institutions in their current state of inertia. But if they
specifically stipulate IR deposit, they thereby immediately increase
the probability and the motivation for creating an IR as well as
adopting an institutional deposit mandate for the rest of the
research output of every one of the institutions that have a
researcher funded by that funder.


      - It is a tactical decision for OA supporters, knowing
      the local habits,
      to advertise ways of deposit to colleagues

But we already know that advertisement, encouragement, exhortation,
evidence of benefits, assistance -- none of these is sufficient to
get most researchers to deposit. Only requirements (mandates) work
(and you seem to agree).

Now institutions are the "sleeping giant" of OA, because they are the
universal providers of all of OA's target content. So to induce the
"sleeping giant" to wake up and mandate OA for all of his research
output, there has to be something in it for him (or rather them,
because the "sleeping giant" is in fact a global network of
universities and research institutions). What is in it for each of
them? A collection of its own institutional research output that it
can host, manage, audit, assess and showcase. What use is it to each
of them if their research output is scattered globally willy-nilly,
in diverse CRs? It increases the research impact of the institution's
research output, to be sure, but how to measure, credit, showcase and
benefit from that, institutionally, when it is scattered

Now, as noted, importation/exportation/harvesting can in principle
work both ways. But if a university that might wish to host its own
research assets has to go out and find and harvest them back from all
over the web, because they were deposited institution-externally,
instead of being deposited institutionally in the first place, the
time and effort involved is considerably greater than simply
mandating direct institutional deposit would have been -- and that
back-harvest does not even yield all of the university's output: only
whatever institutional research output happened to be funded by
funders that also mandate OA! Yet if those funders had mandated IR
deposit, all that work would already be done, and the university
would have a strong incentive to adopt a mandate requiring the rest
of its research output to be deposited too.

Meanwhile, for a mandating funder, harvesting the distributed IR
content of all of its fundees into a CR is far easier, as the
fulfillment conditions for the grant need only specify that the
author should send the funder the URL for the IR deposit of all
articles resulting from the grant. The rest can be done automatically
by software.


      - we have to make sure that people in charge of funding
      research (EU,
      National) do not oblige researchers to deposit in one
      specific place
      (their CR or any other)

On the contrary, there is every reason that funders should specify
the fundee's IR as the preferred locus of deposit, for the reasons
just adduced. Open-ended mandates are better than competing CR
mandates, but they are not nearly as good as convergent, synergistic
IR mandates (to help awaken the sleeping giant).

(As I was writing this posting, two new funder mandates have been
announced -- FRSQ in Canada and NRC in Norway: Both are welcome, but
both are open-ended about deposit locus, and consequently both miss
the opportunity to have a far greater positive effect on global OA
growth, by stipulating IR deposit.)

      - But I understand them, because when they ask
      researchers to give access
      to their work and advertise the fact that they have been
      paid by them,
      there is currently no practical way of doing it (labels
      put on deposit
      with the name of the program which gave the money, and
      harvesters able to
      compute this information ?)

Yes, precisely. Funding metadata can easily be added as a field in
the IR deposit software -- and institutions will be only too happy to
help in monitoring grant fulfillment conditions in this way, in
exchange for the jump-start it provides for the filling of their own
IRs.
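
As a rough sketch of what such a funding field makes possible, assume each IR deposit record carries a list of funder names. The record layout, field names, program names and URLs below are hypothetical, not those of any particular IR software.

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class Deposit:
    """One IR deposit; 'funders' is the hypothetical funding-metadata field."""
    url: str
    title: str
    funders: list = field(default_factory=list)

def by_funder(deposits):
    """Map each funder name to the URLs of the deposits it funded."""
    index = defaultdict(list)
    for dep in deposits:
        for funder in dep.funders:
            index[funder].append(dep.url)
    return dict(index)

deposits = [
    Deposit("https://ir.example.edu/1", "Paper A", ["Program X"]),
    Deposit("https://ir.example.edu/2", "Paper B", ["Program X", "Program Y"]),
    Deposit("https://ir.example.edu/3", "Paper C"),  # unfunded research
]
index = by_funder(deposits)
```

A funder's harvester would only need the entries under its own name, while the unfunded deposit still counts toward the institution's own collection.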

      - I also understand them because I feel that they want to
      add interesting
      tools (search, computation, meta-engine), tools which
      could be developped
      by central harvesters (CH). We are late on this issue and
      harvesters have
      not made much progress (see hereafter).

To repeat: Locus of direct deposit has nothing whatever to do with
harvester-level search. Search is not done at the IR level but at the
harvester (e.g., CR) level.


      1) HAL and research evaluation
      3 years ago I tried to convince my former lab to open a
      sub-archive within
      HAL (same repository, but URL specific to the lab, with
      proper interface).
      I also tried to convince my university to have a general
      meeting with
      directors of local labs in order to invite them to do the
      same and, at
      another level, to manage the sub-archive in HAL for the
      university (a
      solution somewhere in between CR and IR). My colleague of
      the lab agreed,
      started the work but gave up because of lack of time. My
      university never
      answered to my proposal.

HAL is a nationwide resource that can in principle be used (much the
way the Web itself is used) to allow an institution to create and
manage its own "virtual IR". As such, HAL is partly a platform for
creating virtual IRs, rather than a CR.

So, essentially, what you and your colleague tried to do (and only
partly succeeded) was to create and manage an IR. That's splendid,
and welcome, but we already know that IRs alone are not enough.
Without a mandate, they idle at the usual 15% baseline.

(Please note that a lab repository is an IR.)


      Now, thanks to procedures for evaluating research in
      France, labs will have
      to choose the way they want to be evaluated (I mean the
      procedure to achieve it). Some software used by the
      national board will
      do the computation out of HAL. Consequently, my lab
      decided this week to
      urgently re-open and manage its sub-archive in HAL. Of
      course, the first
      thing they have to do is deposit of metadata. Actual
      deposit of
      corresponding papers is not mandatory. But they will take
      the opportunity
      to suggest to researchers to deposit as well their full

It won't work; it's been tried many times before. So this is a great
opportunity lost. As you see, the IR clearly languishes neglected
without a mandate. With a mandate -- particularly one in which
evaluation is based on what is deposited, as in Prof. Rentier's
mandate at Liège -- researchers perk up and deposit. But if all they
have to deposit is metadata, that's all they will deposit (even
though adding the full-text is just one more keystroke).

The reason is that the effect of mandates is mostly not coercive.
Researchers don't jump to deposit just because they are required to
deposit. They actually want to deposit, but they are held back by two
main constraints, one small, the other big: 

(1) The small constraint is ergonomic. Researchers are overloaded,
and they will not do something extra unless it really has a high
priority. A deposit mandate, especially one tied to funding and/or
evaluation, gives the few minutes-worth of keystrokes per paper
(which is all that a deposit amounts to) the requisite priority that
they otherwise lack.

(2) The big constraint is psychological: Researchers are
(groundlessly) afraid to deposit their papers (even the 63% for which
the journal already gives them its explicit blessing to do so) --
afraid until and unless their institutions and/or their funders tell
them they must, because then they know it is officially okay to do
so! The mandate unburdens their souls, and unlocks their fingers.


      Last thing : I do not mean that in France, only HAL
      should be used. We
      should make sure we have the choice to deposit where we

What France needs, like every other country, is funder and
institutional mandates converging on single-locus IR deposit
(irrespective of whether the IR is hosted by HAL). But if mandating
funders leave locus-of-deposit open, or insist on generic deposit in
some CR or other, the giant will keep hibernating, institutional
(departmental, laboratory) mandates will not be adopted, and what IRs
there are will continue to lie fallow.

      2) Harvesters: advantages and current limits
      Just a personal experience. Till recently I used to
      advertise my list of
      publications by giving the URL of an open archive Edutice
      (a thematic one,
      VERY USEFUL in our domain, sub-part of HAL but with its
      local procedure,
      interface, etc.).
      Now I give to colleagues the OAISTER URL (with the path
      to follow) to get
      all my publications (because some of them are in other
      The problem is : deposits in Edutice appear twice in the
      OAISTER list (as
      deposits of Edutice and of HAL - but there is one only
      It is a concrete example of progress which should be made
      to avoid
      repetitions in harvesters (among many other new

If they had all been deposited in your own IR you would have had an
automatic listing of all your works (without duplications) through a
simple Google IR site-search "chanier site:http-IRetc." -- and your
institutions would have it all too. And so would OAIster. And you
could have exported to Edutice with SWORD if you wished.

De-duplication and version-comparator software is already being
developed (though it's hardly worth it, when the problem is not the
presence of duplicates but the absence of even a singleton for 85% of
global refereed research output) -- and that's what mandates in
general -- and convergent IR mandates in particular, to awaken the
slumbering giant -- are needed for.
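
For illustration only, harvester-side de-duplication of the kind described (one paper listed under both Edutice and HAL) might look like the sketch below. It assumes records are matched on a normalized title plus year; that matching rule and the sample records are assumptions, not a description of any existing de-duplication software.

```python
import re

def normalize_title(title):
    """Lowercase, drop punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", title.lower())
    return " ".join(cleaned.split())

def deduplicate(records):
    """Keep the first record seen for each (normalized title, year) key."""
    seen = set()
    unique = []
    for rec in records:
        key = (normalize_title(rec["title"]), rec["year"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

# Hypothetical harvested records: the first two describe the same paper.
harvested = [
    {"title": "Open Access: A Survey", "year": 2008, "source": "Edutice"},
    {"title": "open access  a survey.", "year": 2008, "source": "HAL"},
    {"title": "Another Paper", "year": 2009, "source": "HAL"},
]
unique = deduplicate(harvested)
```

Matching on title and year is deliberately crude; it would miss retitled versions and could merge distinct papers that share a title, which is where version-comparator software comes in.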

Stevan Harnad


      ****************************** end of Thierry's message

      On Wed, February 4, 2009 22:12, Bernard Rentier wrote:
> I agree. It is exactly what I was trying to say in my last paragraph:
> it is my belief that launching a centralised and/or thematic repository
> (C-TR) can make sense, but only if it does not discourage authors from
> posting their publications in an institutional repository (IR),
> otherwise many publications will be lost in the process (I mean lost
> for easy and open access).
> In addition, direct posting in C-TRs will shortcut IRs and it will be
> a loss for universities in their attempt to host their
> scholarly production (this is just a collateral effect, I know, but
> being a University President, it is a worry for me).
> C-TRs are of much more interest if they collect data at a secondary
> level by harvesting from primary IRs.
> Bernard Rentier
Received on Thu Feb 05 2009 - 21:51:20 GMT
