Re: A response from the Wellcome Trust on Conflating OA Repository-Content, Deposit-Locus, and Central-Service: a

From: Stevan Harnad
Date: Thu, 10 Dec 2009 15:10:00 -0500

Many thanks to Robert Kiley of the Wellcome Trust (WT) for responding
to my recommendations on optimising the Trust's Open Access Mandate,
but unfortunately Robert only repeats points with which I am already
very familiar, while passing in silence over the actual substantive
points I have raised, repeatedly, ever since the Wellcome Trust
mandate was adopted 5 years ago (and even earlier than that).

Let me summarise the (many) positive aspects of the Wellcome Trust
Mandate before specifying, once again, the negative aspects that can
so easily be fixed.


POSITIVES:

(1) The WT OA mandate, adopted five years ago (2004), was the world's
first funder mandate and helped to inspire many others.

(2) The WT OA Mandate not only came earlier than the NIH policy, but
it was a mandate (requirement) from the very outset, whereas the NIH
policy lost 2 years by being initially formulated as a request rather
than a requirement.

(3) The WT in general (and Robert Kiley and Robert Terry in
particular) have worked valiantly and tirelessly to promote OA and OA
mandates during the ensuing 5 years.


NEGATIVES:

(1) The WT OA Mandate stipulates direct deposit in PubMedCentral (PMC)
instead of institutional deposit and central harvesting; this
counterproductive constraint has since been imitated by other funders
following WT's example. Institutions are the universal providers of
all OA output, funded and unfunded, across all disciplines. If funders
mandate institutional deposit, they encourage and reinforce universal
adoption of institutional OA mandates (and gain a powerful ally in
monitoring and ensuring compliance); if funders instead mandate
central deposit, they discourage and compete with universalizing the
adoption and implementation of institutional mandates.

(2) The WT OA Mandate permits a delay (embargo) of deposit for up to a
year after publication. If funders instead mandate immediate
institutional deposit, with no exceptions, the institutional
repository's "author email eprint request" Button can provide Almost-
OA to would-be users while access to the deposit is embargoed;
otherwise researcher access, usage and impact are needlessly lost
during the embargo.

(3) The WT OA Mandate allows the option of publishers doing the PMC
deposits in place of WT's fundees. This not only makes fundee
compliance vaguer and compliance-monitoring more difficult, but it
further locks in publisher embargoes (with no possibility of authors
providing Almost-OA to tide over user needs during the embargo period)
and further discourages convergent institutional mandates.

All three of these dysfunctional implementational details can be
easily and fully remedied simply by specifying that deposit should be
in the fundee's IR (or, if the fundee's institution does not yet have
an IR, in an interim IR such as DEPOT) immediately upon acceptance for
publication. That's all; the negatives are thereby immediately
nullified and the WT funder mandate becomes the optimal model for
adoption by other funders, as well as a strong impetus to the adoption
of complementary deposit mandates by institutions.

Now I reply to Robert's responses:

On 9-Dec-09, at 12:20 PM, Robert Kiley (Wellcome Trust) wrote:

> Stevan
> Your last post made a number of comments about PMC, UKPMC and the
> Wellcome Trust, which I'd like to respond to:
> 1. Empty repositories?
> [1a] You indicate that "PMC (or its emulators)" are bereft of content.
> This is not the case. PMC currently has around 1.9 million full-text
> articles. [1b] Looking at the compliance levels for funder mandate,
> we see that around 43% of Trust funded research is made available
> through UKPMC in line with our mandate. We are actively working to
> increase this figure. [1c] It is also worth noting that, because of
> our support for gold OA, a significant proportion of these articles
> are freely available at the time of publication.

[1a] Let me define an "empty repository": a repository that captures
0-15% of its total annual target content.

Why? Because 15% is the default baseline for spontaneous, unmandated
deposit. You are not doing better than the default baseline if you are
not capturing significantly more than 15% of your repository's total
annual target content.

What percentage of the global annual output of peer-reviewed
biomedical journal articles -- per year -- do you think that PMC's
grand total of 1.9 million articles represents?

It's only that (annual) figure (minus 15%) that tells you how
non-empty a repository is, not the grand total (and certainly not the
grand total for a central repository whose denominator (the annual
amount by which you must divide the annual deposits to calculate the
percentage) consists of all annual biomedical research on the planet
(or even all annual biomedical research originating from the US)).

This is what I called the "denominator fallacy" in my prior posting.
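
As a rough illustration of the arithmetic behind that point, here is a
minimal Python sketch. It is not part of the original exchange: the
function names and the deposit counts are hypothetical, chosen only to
show the calculation. The 15% baseline and the 43% compliance figure
are the ones quoted in this post.

```python
# Sketch of the "denominator" calculation: a repository's fill rate is
# its ANNUAL deposits divided by its ANNUAL target output, not its
# cumulative grand total.

BASELINE = 0.15  # spontaneous, unmandated deposit rate cited in this post


def annual_capture_rate(annual_deposits: int, annual_target_output: int) -> float:
    """Fraction of one year's target content that was actually deposited."""
    if annual_target_output <= 0:
        raise ValueError("annual_target_output must be positive")
    return annual_deposits / annual_target_output


def is_effectively_empty(annual_deposits: int, annual_target_output: int,
                         baseline: float = BASELINE) -> bool:
    """True if the repository captures no more than the unmandated baseline."""
    return annual_capture_rate(annual_deposits, annual_target_output) <= baseline


# Hypothetical counts, for illustration only (not real PMC or WT data).
# A central repository whose target is a whole field's annual output:
central_empty = is_effectively_empty(100_000, 1_000_000)  # 10% of target
# A funder's own output at the 43% compliance level quoted above:
funder_empty = is_effectively_empty(430, 1_000)           # 43% of target
```

The same deposit count can look large or small depending on the
denominator chosen, which is why a cumulative total on its own says
nothing about how full a repository is.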

[1b] In contrast, PMC is capturing 43% of WT's target content in 2009.
That's certainly better than 15% (or NIH's meagre 5% before they
upgraded their deposit request to a requirement). But that's mandated

And 5 years after the adoption of the WT mandate, 43% isn't really
that good either. And it was only last year that WT itself was
expressing concern about its low compliance rate:

In contrast, institutions that have adopted and implemented deposit
mandates are doing a good deal better than that: Over 60% and well on
the road to 100% about 2 years after adoption.

And the reason for the successful institutional mandates' success is
quite evident: Institutions are the quotidian employers of their
researchers, not just their occasional funders. And they have annual
performance reviews for salary, promotion and tenure. Institutions are
in a position to mandate -- as the University of Liege has notably
done -- that the mechanism for submitting one's annual publications
for performance review is henceforth to deposit them in their IR;
otherwise the publications will be invisible. This is a simple
internal bureaucratic requirement, rather like the ubiquitous
transition from submitting on paper to submitting online.

Institutions, as we all know, are also very eager that their
researchers should receive research funding. Hence institutions are
eager to be involved in helping researchers prepare grant applications
as well as to ensure that they fulfill all grant requirements if
funded. Fundees' institutions are hence funders' natural allies in
ensuring and monitoring compliance with the funder's deposit mandate
-- as long as the designated deposit locus is institutional. Moreover,
funders mandating institutional deposit of the articles resulting from
the research they fund, and institutions' involvement in ensuring
compliance, also encourage institutions to go on and mandate deposit
of the rest of their research output too.

But if it is instead stipulated by the funder -- and I have to repeat
this each time: for no good reason at all, since it confers no
advantages whatsoever, either functional or practical, over
institutional deposit, only disadvantages -- that deposit must be
central, then the fundee's institution is in no better position than
the funder to ensure and monitor compliance. In addition, the
institution then has the opposition of its researchers to contend
with, if ever the institution contemplates adopting a deposit mandate
of its own: Researchers (quite understandably, and justifiably) do not
want to have to deposit willy-nilly in divergent multiple loci,
institution-internal as well as institution-external. Add to that the
further confusion added by the fact that fundee "compliance" can be
fulfilled by *publishers* depositing in PMC instead of fundees, and
after a one-year embargo, and you have both grant fulfillment
conditions and mandate incentive conditions as ill-served and hard to
monitor as they could possibly be.

And, again, for no good positive reason whatsoever.

[1c] Yes, the WT money that could have been spent on supporting more
research, when it is instead redirected to paying for Gold OA
publication, does increase the uptake of Gold OA somewhat. But is that
the objective? Or is the objective rather to increase OA as much as
possible -- which is what the Green OA deposit mandate itself would
do, if compliance were indeed ensured and monitored?

As to the best way to contend with the 1-year embargo at this point --
that's up to WT to decide. 63% of journals already endorse making
institutional deposits OA immediately upon publication. If WT finds it
a better use of its research money to pay for immediate Gold OA for
the remaining 37% (rather than relying on the Institutional
Repositories' "email eprint request" Button to allow the author to
provide almost-immediate, "Almost OA" during the embargo), that's a
judgment call. But it's not an argument for insisting on central
deposit rather than institutional.

Note, though, that WT is on the side of the angels in having mandated
OA already, rather than just offering to subsidize Gold OA. The
trouble is that the "mandated Green OA deposit plus subsidized Gold OA
option" policy is far less generalizable, for example, for poorer
funders, or funders more anxious to use their scarce funds to fund
more research rather than to subsidize Gold OA publishing. This is
especially so today, when OA can be had without cost, by mandating Green
OA and just letting subscriptions continue to pay for publishing. And
this remains true until/unless Green OA ever makes subscriptions no
longer sustainable. Then (and only then) a transition to Gold OA will
be payable out of institutions' windfall subscription cancellation
savings -- and for a lot less than today's Gold OA's pre-emptive asking
price, since the only thing left to pay for then will be peer review
-- without the need to syphon away any additional research money.

Moreover, the example of pre-emptive payment for Gold OA has inspired
another nonstarter, from funders and institutions that are not yet on
the side of the angels: They are redirecting scarce research or
institutional funds today, needlessly, to pay for Gold OA today
without even mandating OA, as WT has done. That's the worst of all
possible worlds (and encouraged by the example of needless and
ineffectual profligacy on the part of others, even when they do couple
it with a Green OA mandate too...)

> 2. "National PMC's are a joke"
> The idea that UKPMC is a "joke" is not shared by the Wellcome Trust,
> or indeed the other 7 biomedical research funders in the UK who have
> established (and funded) this repository.
> You suggest that UKPMC simply holds UK content. This is not the case.
> It holds **all** the PMC content, but each "locality" (UK, Canada
> etc) can build services on top of that to ensure that it meets the
> needs of the research community we are trying to serve.

That's all splendid, and not the joke at all.

The joke is the notion that all these countries need a national PMC as
the place to mandate deposit!

Of course all manner of harvesting services can be superadded to any
number of harvested collections -- national, disciplinary or what have
you. That's not the joke. The joke is that national funders are
slavishly adopting the wrong-headed notion that they, like NIH/PMC,
need their own national, central place to deposit their mandated
contents -- instead of doing what NIH/PMC should have done in the
first place (and should convert as soon as possible to doing now),
which is to mandate institutional deposit, and harvest/import from
there to any central collections or services they may wish to provide.

> In January 2010 we will be launching a new UKPMC site which will
> offer users:
> A) A single access point to around 20 million bibliographic records -
> drawn from PubMed, Biological Patents, Agricola and Clinical
> Guidelines databases - as well as the 1.5m+ full text articles in

Splendid. But now please explain to me why the worthy and welcome goal
of offering *users* a single access point for all these worthwhile
contents needs to be reached by requiring UK-funded *authors* to
deposit in UKPMC to fulfill their deposit mandates, rather than in
their own IRs?

(And I do hope you won't reply that it's in order to accommodate the
publishers, who need to deposit in UKPMC! Those articles are by UK
fundees too! Let those fundees simply, and uniformly, deposit all
their (mandated) articles in their own IRs, regardless of whether they
are published in paid Gold OA journals, free Gold OA journals,
subscription journals with OA embargoes, or subscription journals
without OA embargoes: One size fits all, funders and institutions
alike, across nations as well as disciplines, for both funded and
unfunded research: Deposit institutionally.)

> B) Additional, local content. This includes guidelines from NICE and
> other NHS bodies, plus relevant (i.e. biomedical) theses derived from
> EthOS. So, by way of example, when you search the new site for say
> "management of stroke" you will be presented with relevant PubMed
> citations, full-text articles, UK clinical guidelines etc in one
> search.

All very valuable stuff -- but nothing in this is contingent on
mandating central deposit. Harvesting of distributed content is the
name of the game, in the online era. (We don't deposit directly in
Google either. Google harvests distributed locally deposited content.)

> C) New citation services. For every article (be it full text or just
> the bibliographic citation) you will be able to see all the papers
> which that paper cited, as well as all the other articles which cite
> that paper.

Lovely stuff, but nothing to do with the only point at issue here,
which is whether or not mandating funders have any good reason to
require divergent central deposit instead of convergent institutional
deposit. (The latter might even help accelerate the institutional
mandates you'll need to turn those bibliographic citations into full-
texts -- at least for living authors...)

> D) New text-mining services. Our colleagues at EBI and NaCTeM have
> built tools to textmine the content in UKPMC. In the first release
> (January 2010) users will be able to see a "summary box" which will
> provide details of what genes/proteins, organisms etc are discussed in
> the paper they are viewing. Over the next 18 months this textmining
> functionality will be developed further to include chemical compounds,
> disease names etc etc.

Again very valuable, and again completely orthogonal to the question
of locus of deposit -- which, to repeat, is the only one I keep
banging on about.

(If funders wish to mandate deposit in specific formats, such as XML,
they can do that equally well regardless of locus of deposit -- though
I would not myself recommend over-constraining format requirements at
this early stage, when it is the articles that are missing and sorely
needed, rather than the documents already being accessible, and only
the right format being sorely needed. And if, in contrast, the deposit
tagging and format are being enriched by some other central service,
rather than the author, that too can be done irrespective of locus of
deposit, again through central importation or harvesting.)

> The "franchise" model that PMC uses is akin to that developed for the
> human genome project inasmuch as content is mirrored to a number of
> sites (e.g. NCBI, Sanger, and DDBJ) but each centre develops their own
> interface to this content. So, the core content at PMC, UKPMC and PMC
> Canada is identical -- but each centre will develop their own
> valued-added services.

The "franchise" model is equally compatible with central deposit and
with distributed institutional deposit and central harvesting...

> The UKPMC Funders Group - led by the Wellcome Trust - are, with the
> support of European partners, exploring the possibility of creating a
> single, Europe-wide OA repository for peer reviewed biomedical
> research papers -- a Europe PMC. A workshop to discuss this is taking
> place on the 2nd December at the Berlin 7 meeting. [See:

All these collections and re-collections of biomedical research papers
and services are welcome, but have nothing to do with mandated deposit

> 3. Why the Trust favours the author-pays model
> ***
> The Wellcome Trust has always argued that:
> A) dissemination costs are simply research costs
> B) publishers add-value to the research article

Fine. Call the costs what one may: those publication costs and values
are being paid for in full by subscriptions today. What is missing is
access to those publications (for those whose institutions can't
afford the subscription costs). Green OA provides that access, in
full. And mandates provide Green OA, with no extra cost. It's up to WT
if they want to spend more research money on reforming publishing,
rather than just providing the access that is missing. But let that
not be mistaken or misrepresented as the fastest, cheapest or surest
way to provide the missing access. It is simply using research money
to try to reform publishing.

Nor can WT represent favouring the payment for Gold OA with scarce
research funds over providing Green OA at no extra cost as something
that favours OA: It does not. It simply diverts research money to pay
pre-emptively for Gold OA, when it is not even needed; it disfavours
the cost-free Green way of providing OA; and it sets an unfortunate
example for other funders contemplating what they can do to increase OA.

> It follows, therefore, that these costs have to be met -- and that is
> what the Wellcome Trust (and others) do.

It only follows that those costs have to be met if there is also a
reason why research money has to be spent on reforming publishing
today, when what is really needed today, urgently, is more research
access, not less research money, nor publishing reform.

(Publishing reform will be needed, and will happen, if and only if and
when universal Green OA makes subscription journal publishing
unsustainable. But if and when universal Green OA ever does so, it
will, by the very same token, also release the subscription
cancellation funds to pay for Gold OA without the need to redirect
scarce research funds. Indeed, universal Green-OA-driven subscription
collapse will also force journal publishing to cut obsolete products
and services (such as the paper edition, the online edition, access-
provision and archiving) and their associated costs, downsizing to
just the service of peer review. The distributed network of
institutional repositories (and any harvester services thereover) will
do the access-provision and archiving. So instead of receiving less
research funding, researchers' institutions will enjoy a surplus from
their annual windfall subscription cancellation savings.)

> It is also worth pointing out that when an APC fee is met, the Trust
> requires the publisher to provide a number of services:
> A) Deposit the final version - marked up to an XML standard - directly
> into PMC, where it must be made freely available at the time of
> publication. [So, no embargo, no "email request buttons", and no work
> on behalf of the author.]

If you offered your fundees the choice (without fear or favour) of
spending the WT research money on research or spending it to spare
themselves the few keystrokes it takes to deposit their postprints
(63%) and fulfill email eprint requests (37%), do you have any doubt
as to what choice they would make? Especially if the designated locus
of deposit were institutional, and hence they were already depositing
their unfunded research that way...

> B) Attach a licence to these articles, thus ensuring that anyone who
> want to re-use the work (e.g. text-mining, creating translations,
> re-using for different audiences etc) can do so. Whether such rights
> extend to author manuscripts is, at best, unclear.

More important, those rights and re-uses are completely superfluous.
What's urgently needed (and prominently missing) today is online
access to the articles, free for all. What comes with that territory
is the capability of any user to search, link, read online, download,
print-off, and data-crunch a personal copy. In addition, harvesters
like Google can and will harvest and invert it. "Different audiences"
can use the same URL. Translations (for the lucky few where it's
wanted) can, as always, be handled on a case by case basis.

Let's talk again about any "text-mining" beyond this when there's
enough OA text to make it worth talking about.

> C) These articles can also be included in the OA subset, thus allowing
> institutions (and others) to harvest, via OAI, relevant full-text
> content.

That sentiment is not unworthy of Marie Antoinette! "Let the
institutions harvest back their very own content, because we have
elected to mandate that it must be deposited institution-externally."
(Harvesting, for the record, is something central harvesters do over
distributed providers of the content, not the reverse, i.e., not
distributed providers of the content harvesting it back from an
institution-external central deposit locus where their own content
providers have been required to deposit it, instead of depositing it
institution-internally in the first -- and only -- place. That's like
saying, let everyone deposit their content in Google, and then harvest
it back if they want it locally.)

SUMMARY: Not one substantive reason has been given for WT's continuing
insistence on central deposit rather than institutional deposit (and
central harvesting). Nor has a compelling reason been given for
favouring paid Gold OA over free Green OA (but if WT mandated
institutional deposit, this would become a minor matter, because as
more institutions added their institutional mandates to WT's and other
funders' mandates, the absurdity -- and non-scaleability -- of paying
pre-emptively for Gold OA today, rather than just depositing for Green
OA at this time -- while the potential funds to pay for Gold OA are
still locked into subscriptions that are paying for subscription
publication in full -- would become more and more obvious. The
confusion and uncertainty about this today are simply a result of the
extreme sparseness of OA content -- whether Green or Gold -- today [c.
15%], as well as the extreme rarity of OA mandates [c. 100/10,000].)

Best regards,

Stevan Harnad

> Best regards
> Robert Kiley
> Wellcome Trust
Received on Thu Dec 10 2009 - 20:32:26 GMT
