Re: Central versus institutional self-archiving

From: Jean-Claude Guédon
Date: Mon, 4 Oct 2004 12:58:32 +0100

Here we go...

On Sat October 2 2004 08:16 am, Stevan Harnad wrote:
> On Fri, 1 Oct 2004, [identity deleted] wrote:
> > While OAI compliance is a sine qua non condition of some measure of
> > inter-operability, it does not (yet?) ensure the kind of ease of
> > retrieval that other forms of archiving can provide, including some form
> > of central archiving.
> This is incorrect.

It is entirely correct. Distributed archiving is bound by the limitations put
on the OAI protocol; a centralized archive is not bound by such limitations.
It is, therefore, easier for a centralized archive to make retrieval easy, in
any case easier than with a distributed system.

Just so I am not misread, I am not saying this to claim that we should forget
about distributed archives; I am saying this to respond to Stevan's
misgivings about some institutions or individuals supporting centralized
archiving. In the end, it does not really matter.
> This erroneous view that central archiving is somehow better or safer
> than distributed/institutional archiving is exactly analogous with older
> views that on-paper publication is somehow better or safer than on-line
> publication. The latter papyrocentric habit and illusion has happily
> faded, thanks mainly to the force of the example and experience with the
> growing mass of on-line content and usage. (But this obsolete thinking
> did not fade before it managed to delay progress for several years;
> nor has it faded entirely, yet!)

I have only argued that retrieval could be made easier in a centralized
archive than in a distributed archive by virtue of the simple fact that a
protocol such as OAI has to be kept simple. Therefore, compromises have to be
made which a centralized archive does not have to deal with. This has nothing
to do with "papyrocentric" - incidentally, I have never used papyrus myself,
only paper which, at worst, would make me paperocentric... - habits,
illusions or obsolete thinking.
> The instinctive preference for central over distributed archiving is a
> remnant of that same papyrocentric thinking ("the texts are safer and
> more tractable when they are all in the same physical place") and will
> likewise fade with actual experience and more technical understanding. The
> trouble is that the preference (in both cases) is invariably voiced in
> contexts and populations that lack both the technical expertise and the
> experience with the newer, untrusted modality.

This has strictly nothing to do with my argument.
> And it always appeals to an uninformed audience that is a-priori more
> receptive to what more closely resembles the old and familiar than what
> resembles the new and less familiar, and that bases its sense of what is
> "optimal" not on objective experiment and evidence, but on subjective
> habit.

> The place to voice any doubts or uncertainties on technical questions
> like this is among technical experts with experience, such as the OAI
> technical group, not in the wider populace that is still naive and leery
> about the online medium itself, archiving, and open access.

> > Let us not forget that OAI-compliance may also lead to a mixing of
> > various levels of documents, for example some peer-reviewed, others not.
> The Eprints software includes the tag "peer-reviewed" and "not peer
> reviewed". This means documents can be "de-mixed" according to the metadata
> tags, as intended. In addition, the journal-name tag is an indicator. The
> old idea that physical location is the way to de-mix is obsolete in the
> distributed online era that the Web itself so clearly embodies.

So this means an extra step in the retrieval technique and it must rely on
some degree of trust in all the registered depositories... Thank you, Stevan,
for demonstrating my point so clearly.

As for the rest of the paragraph, it is irrelevant.
> Moreover, the mixing of types of documents is a function of the archiving
> policy, not of the archive-type (institutional or central) or location.

Exactly what I said above: how can you trust the institutions to have the same
policies or, where the policies are the same, the same rigor in applying them?
> Lastly, the inclusion of both peer-reviewed journal articles *and*
> preprints, post-publication revisions and updates is a desirable
> complement, and can likewise be handled by various forms of pre-
> and post-triage using both the metadata and meta-algorithms based on
> metadata and full-text (de-duplication, dating and versioning at the
> harvester level).

> > Because of this, the perception of archives that are only OAI-compliant
> > may not be entirely favorable. Scientists/scholars may not make much or
> > even any use of these sources simply because they consider them as too
> > "noisy" or worse.
> Are we then to recommend policy not on the basis of the actual empirical
> and technical facts, but on the basis of the prevailing "perception"? If
> we had adopted that strategy, we would have renounced the online medium
> itself a-priori, and renounced also the notion of Open Access! We are
> here to promote what is in fact optimal, not what is *perceived* to be
> optimal, according to existing habits and practices.

When you start asking institutions to do x, empirical and technical facts are
of the essence but they won't cut it alone. This may be regrettable, but
it is a fact of life. Depositories, decentralized and centralized, are useful
and people such as Stevan and myself will push for them (or for some variety
of them, in the case of Stevan), and that is all right. However, while
pushing, it may also be useful to try and get a clear vision of where the
rest of the world stands and not try to multiply oppositions by disregarding
perceptions. This is what a savvy "world changer" tries to do.
> (Moreover, what is specifically at issue here is what form of
> self-archiving to *mandate* -- institutional or central for erstwhile
> non-self-archivers. This is an opportunity to guide and shape habits,
> rather than to be held back by them.)

> > Central (OAI-compliant) archiving is not mutually exclusive with
> > distributed, OAI-compliant archives; it simply completes and reinforces
> > the archival system that is being presently explored and experimented
> > with.
> That is entirely correct, and is one of the premises of OAI-compliance and
> interoperability: *All* forms of archiving are in fact forms of distributed
> archiving and are made interoperable and equivalent by OAI-compliance. So
> no one has said central archiving lacks any of the functionality of
> institutional archiving.

But I say that distributed institutional archiving may lack some of the
functionalities that can be implemented on a centralized archive (again, just
to make things crystal clear, I am not against distributed archives; I simply
object to Stevan's phrase: "It is ever so important that the Research
Councils implement the *UK* recommendation -- to mandate institutional
self-archiving of all UK journal-article output -- and not be drawn (as I
alas heard some hint of at the eurocris meeting) into the much less
focussed, thought-through, and optimized US/NIH notion of mandating central
archiving."). If Stevan claims (rightly) that both forms of archiving are
equivalent, why spend so much time on this issue? One way to delay matters is
to pounce on every little detail that does not fit one's own vision and
logic. One way to make people weary of one's arguments is to behave exactly in
this manner.
> The reason I (and others) are coming out so strongly in favour of
> institutional archiving is hence not *functional*, since we fully
> understand how and why both forms of archiving are functionally equivalent
> (and indeed it is the advocates of central archiving that often fail to
> grasp this, and argue on the basis of putative functional advantages of
> central archiving that do not in fact exist).

Except for the possibility of creating retrieval functionalities that exceed
the admittedly limited scope of the OAI meta-data.
> The reason I (and others) are coming out so strongly in favour
> of institutional archiving has to do with the probability of OA
> content-provision itself, i.e., the probability that the OA archives will
> be filled, rather than lie fallow, as well as the closely connected
> probability that archive-filling will propagate across fields and
> archives, rather than be restricted to just one field and archive. (It
> also has a little to do with distributing the archiving burden and costs,
> but that is not the primary reason.)

If that is not putative...
> The authors of the annual 2.5 million articles that we would all like to
> see self-archived as soon as possible are virtually all affiliated with
> institutions of their own (universities or research institutions). They
> also each have disciplines, but author/institution is the relevant pair
> here, not author/discipline: not just because disciplines are nebulous
> entities or because few disciplines have central archives and creating and
> maintaining them is a much more nebulous matter, but because it is authors
> and their respective institutions (not authors and their disciplines),
> that share a common stake in maximizing the access to and impact of their
> (joint) research output -- not authors and their respective disciplines
> (which are, if anything, a locus of competition for impact, rather than
> being its joint co-beneficiaries).

My take on this is quite simply that while institutions are best placed to
organize archives, they should immediately network with peer institutions to
create critical masses in various disciplines; moreover, they should make
these repositories gradually evolve into overlay journals so as to give clear
signals of added value that would in turn really convince scientists and
scholars that there is a concrete advantage in having one's articles located
within OA repositories. Relying exclusively on institutional repositories,
OAI compliance and mandating is not realistic IMHO. The latter (mandating),
in particular, is really political in nature and I am not sure Stevan is
terribly gifted for that kind of work.
> Moreover, institutions (particularly universities) also share most or
> all of the disciplines. So when a self-archiving policy or practice is
> adopted by an institution at all, the probability is very high that it will
> also propagate across all of that same institution's
> disciplines. Moreover, as institutions (and not disciplines) are in
> competition with one another for visibility and impact, the probability
> is also high that if some institutions adopt the policy and practice of
> self-archiving, this will also propagate across (competing) institutions.

If that is not putative...
> In addition, institutions, being the employers of their researchers and
> the co-beneficiaries of research impact, are in a position to mandate,
> monitor and reward compliance with an institutional self-archiving policy
> (through employment, salary, promotion, tenure, etc.).

And the faculty will just comply? If that is not putative...
> Disciplines have neither the interest nor the wherewithal to mandate,
> monitor and reward central self-archiving. Neither do Learned Societies.

> The one prominent and valuable non-institution-based exception is
> research-funders, whether discipline-based or national/international and
> pan-disciplinary: Research-funders too have an interest in maximizing
> the access to and impact of the research they fund, and are hence in a
> position to mandate, monitor and reward self-archiving.

True. Very true.
> However -- and this is a critical point, particularly with the US/NIH
> self-archiving mandate -- research-funders can mandate, monitor and reward
> self-archiving either way: They can mandate central self-archiving, as
> the current version of the US/NIH recommendation does, or they can mandate
> institutional self-archiving, as the UK recommendation does. The effect,
> for the specific funded research itself, is exactly the same. The critical
> difference is in the probability of propagation *beyond* the specific
> funded research in question, toward the 100% OA that we are all seeking.

I do not understand. Why should OA archiving spread easily inside an
institution, but not do the same once some central archive shows results?
> Having established that institutional and central archiving are
> functionally equivalent, and that research-funders can equally well
> mandate, monitor and reward self-archiving on either a central or
> an institutional basis, the only relevant question is: Which of these
> otherwise completely equivalent means is more likely to yield more OA?
> And the answer is unequivocal: mandating institutional self-archiving,
> according to the UK recommendation, rather than central self-archiving,
> according to the US recommendation.

> Yes, there is some probability that discipline-based central-archiving
> mandates by research-funders will propagate across disciplines and
> research-funders too (and they no doubt will). But that propagation is
> just as likely (and will in fact occur far more readily of its own accord)
> if each discipline and research-funder does not need to create and fund
> and maintain a central archive of its own, but can distribute that load on
> the institutional OAI-archiving network, which is already growing because
> of the self-archiving mandates of both prior research-funders and of
> institutions themselves, and is already propagating across the disciplines
> within each institution, and across institutions.

If this is not putative...
> (Institutional self-archiving, by the way, is actually distributed
> locally too, with departments administering and monitoring compliance
> in their own sectors: that is part of the beauty and functionality of
> OAI-interoperability -- as well as of the modular OAI-compliant software:
> "Institutional" self-archiving should really be called
> "Institutional-Departmental" self-archiving.)
> A forthcoming analysis by Rowland & Swan commissioned by the UK Joint
> Information Systems Committee (JISC) has come out decisively in favor of
> distributed institution-based self-archiving over central self-archiving,
> for a variety of reasons, including both functional and economic ones
> based on efficiency, cost and ease of implementation and monitoring,
> as well as strategic reasons based on institutional research culture and
> probability and ease of compliance. (I will circulate the URL of that
> report as soon as it is released.)

Fine, but why spend so much time on this, once again?
> > Consequently, it does not make much sense to focus on this issue. Simply
> > let archives flourish wherever they may and in whatever form.
> On the contrary, it makes a great deal of sense to focus on this issue,
> to try to understand it, and to try to guide policy and implementation
> in the direction that is likely to maximise the propagation of OA
> self-archiving across disciplines and institutions, rather than to
> minimize it:

The difference is putative at best and not demonstrated. The two forms of
archiving are not mutually exclusive, to repeat myself. Implementing one form
somewhere probably corresponds to what is easiest in that particular
circumstance or context; insisting that the alternative should be implemented
there will probably slow things down. At least, this is what a pragmatist
might argue.
> The US central self-archiving mandate will certainly generate OA for
> NIH-funded biomedical research. But why not, for the same money and
> mandate, generate so much more OA, by simply dropping the stipulation that
> the self-archiving must be central (in PubMed Central), and instead
> let the self-archiving propagate naturally across institutions and their
> disciplines? This OA maximization is attainable at no functional cost
> or sacrifice whatsoever. All it requires is a small parameter change
> that will confer huge benefits.

The same money will not spread across institutions. The money NIH is willing
to devote to one central archive will not be given to universities to create
their institutional repositories for all disciplines.
> > If some institutions seem to feel more at ease with the presence of some
> > centralized archives, so be it, so long as they do not object to the
> > parallel development of institutional, disciplinary or even individual
> > archives.
> I could not follow the logic of this. (It seems to confound two senses
> of the word "institution"). As far as I know, no one has spoken about
> what institutions do or do not feel "at ease" with (and most individual
> sentiments of "ease" here are more about ease with what individuals are
> accustomed to, rather than about what is actually optimal, either for
> the individuals, their institutions, or OA). The issue concerns what form
> of mandated OA self-archiving is likely to generate the most OA, soonest.

Institutions here means a variety of organizations ranging from granting
agencies to universities and departments. The phrase "feeling at ease with"
obviously is a metaphor for institution x favoring solution y. It has nothing
to do with being "accustomed to". The problem with your thesis, Stevan, appears
clearly in the sentence above: you really believe there is one and only one
best solution for OA. You also believe you know this one and only one
solution for all cases, institutions, disciplines, etc. And you spend an
awful lot of time defending the rightness of your vision rather than
defending OA more broadly (and tolerantly). This is a waste of time IMHO.
> The concerned parties seem to be the following: (1) NIH, which is a
> central (national) research-funding agency, which is also associated with
> (2) NLM, which has a superb and invaluable central index for abstracts
> and links across all of biomedicine, PubMed, and which also has associated
> with it a small but useful and growing central OA Archive, PubMed Central
> (PMC).
> The US Congress is considering making it law that NIH should mandate
> the self-archiving of all NIH-funded research in PMC. The self-archiving
> mandate for NIH-funded research is extremely desirable and welcome. The
> point under discussion here is that by changing one small parameter in
> the mandate -- namely, to require only that the research be self-archived
> in an OAI-compliant OA archive, without stipulating that it must be PMC
> -- the very same NIH mandate can and will generate far, far more OA,
> naturally propagating of its own accord across institutions and their
> disciplines. The functionality will be identical (and PMC can easily
> and automatically harvest all the NIH-funded institutional metadata if
> it wishes, as well as serve as a backup OAI archive for the full-texts
> if an author's institution does not yet have an OAI archive).
> The UK mandate (if/when implemented) is already optimal in this regard.
> Stevan Harnad
> BOAI-2 ("gold"): Publish your article in a suitable open-access
> journal whenever one exists.
> BOAI-1 ("green"): Otherwise, publish your article in a suitable
> toll-access journal and also self-archive it.
Received on Mon Oct 04 2004 - 12:58:32 BST
