Re: On Self-Selection Bias In Publisher Anti-Open-Access Lobbying

From: Stevan Harnad <amsciforum_at_GMAIL.COM>
Date: Wed, 18 Nov 2009 14:49:33 -0500

On Wed, Nov 18, 2009 at 1:15 PM, Pippa Smart wrote:

> Citation and impact are not easy to quantify as different studies have shown
> and therefore should not form the basis for arguing in favour of open
> access.

Citations (and downloads) are countable, and are counted. Research
usage and impact is not *synonymous* with citations or downloads, but
citations and downloads are certainly *measures* of research usage and
impact.

If OA generates more citations (and downloads), that is most
definitely a compelling basis for arguing in favor of OA.

(Indeed, no argument for OA could be more compelling than that OA
increases research impact; certainly not the argument that OA makes
journals more affordable, nor the argument that it makes accessible to
the general public peer-reviewed research journal articles most of
which the general public has no interest in reading [nor do most
peers!]. So every download and citation matters for this esoteric
content, written by specialists to be read, used, applied and
built-upon by specialists, for the sake of research progress, and
thereby for the benefit of the general public.)

> Intuitively if an article is made open access then it will have
> higher visibility and gain greater citation - but this is not necessarily
> true.

No one has said this is *necessarily* true. The empirical research on
the OA impact advantage and the "self-selection bias" is being
conducted to see whether it is true as a matter of empirical evidence.

> Studies have shown variable citation behaviour in which the access of
> an article appears to have no bearing. For example higher citation of the
> same article within different (higher "Impact Factor") journals (Vincent
> Larivière and Yves Gingras on

(1) Not every OA article will be cited more: only the ones that are
found useful enough to be citeable will. And the more useful an
article is, the greater the observed OA citation advantage. That is
why the empirical question about causality is: "Are OA articles more
likely to be cited because they are OA? Or are they more likely to be
OA if they are more cited (the self-selection bias)?"

(2) 80% of citations are citations of the top 20% of articles.

(3) The top journals are both more likely to publish the top articles
and more likely to be cited.

(4) Gingras & Larivière are our co-authors on the study testing
mandated OA against self-selected OA that I mentioned (and that is
being submitted for peer review).

> and the "cluster-effect" of citations whereby authors follow citation trails
> laid by papers that they read resulting in a reduction in the number of
> articles being cited (James Evans in Science, 18 July 2008).

Larivière, V; Gingras, Y; & Archambault, E. (2008) The decline in the
concentration of citations, 1900-2007 ["This paper challenges recent
research (Evans, 2008) reporting that the concentration of cited
scientific literature increases with the online availability of
articles and journals"]

> I guess (as with all statistics) it is quite possible to find a study that supports
> one's point of view.

Yes, and that's called a self-selection bias. The remedy is properly
controlled studies and meta-analyses to determine where the
preponderance of the evidence lies (in the metaphoric, not the
mendacious sense!).

> I agree with Ian Russell that accusing publishers of "intensive lobbying" is
> inflammatory since both sides have formed lobbying bodies.

The difference is that OA lobbyists are not doing it for the money.

> Many publishers (commercial or not) are offering authors the opportunity
> to publish OA within their journals.

And to pay them a hefty price for doing it.

But what is under scrutiny here (the self-selection-bias hypothesis)
is not this generous offer on the part of some "Open Choice"
hybrid-Gold publishers, but the alternative, which is author Green OA
self-archiving, and whether that enhances citations, or is merely a
self-selective bias toward self-archiving the top articles.

> The current problem is that someone has to pay for
> the operation of scholarly communication, and there is no simplistic answer
> that will provide an overarching solution for all disciplines in all parts
> of the world - as much as both publishers and other lobbyists would like
> there to be.

"Scholarly communication" is being paid for, handsomely, today, by
institutional journal subscriptions. So that is definitely not the
"current problem." The problem is that not all the intended users for
which this research is being conducted can access it, because their
institutions can only afford to subscribe to a small fraction of the
peer-reviewed journal corpus.

That's the "current problem." And the -- yes, simple -- solution is
for researchers' institutions and funders to mandate that all their
own journal article output be made freely accessible online -- to all
its intended users (not just to the ones at the institutions that have
subscriptions to the journal in which it happens to be published) --
by ensuring that all authors self-archive their refereed final drafts
in their own institutional repositories immediately upon acceptance
for publication.

There are no disciplinary or geographic differences for the
peer-reviewed journal article corpus in this overarching solution --
neither in its benefits nor in its feasibility -- much though some
publishing lobbyists might wish there were.

> (And to pre-empt the response that repositories would provide the answer,
> no, I don't [think] they necessarily will for all disciplines and in all
> institutions, partly because they do not provide the content filtering and
> other valuable benefits that journals currently do, and partly because of
> the additional time/effort/expenditure required of libraries/institutions -
> some can easily meet the requirements, whereas others may not.)

Pippa, I think you may have missed the point about *what* is being
mandated for deposit in authors' institutional repositories: the
peer-reviewed final draft, immediately upon acceptance for
publication.

That core fact has been mentioned explicitly and frequently enough, I
should have thought, to pre-empt such a stupendous non-sequitur. But
-- to pre-empt another -- I rather suspect you are co-bundling another
unstated familiar [publishers' doomsday] hypothesis with your notion
of "repositories" [and OA mandates]: that they will destroy journals
and peer review.

Don't worry. The peers -- the very same authors and users in question
-- do the peer-reviewing. The remaining expense is that of a reputable
3rd-party honest-broker to continue implementing the peer review and
certifying its outcome with its (journal-) title and track-record.
Those titles whose publishers prefer not to downsize to this more
parsimonious niche, if and when the time comes, will simply migrate --
title, track-record, editorial board, referees, authors, readers and
all -- to other (Gold) OA publishers, who will.

So don't worry about "content filtering and other valuable benefits
that journals currently do..."

And let OA advocates worry about (and see to the solution of) the real
"current problem": the needless continuing and cumulative daily,
weekly, monthly, yearly loss of research access, usage and impact
owing to access-denial to intended users.

Journals are doing just fine. It's just research access, usage and
impact that isn't.

Stevan Harnad

> Pippa Smart
> Research Communication and Publishing Consultant
> PSP Consulting
> 3 Park Lane, Appleton, Oxon OX13 5JT, UK
> Tel: +44 1865 864255
> Mob: +44 7775 627688
> Skype: pippasmart
> ****
> Editor of the ALPSP-Alert and Reviews editor of
> Learned Publishing
> ****
> 2009/11/18 Stevan Harnad wrote:
>> Response to Comment by Ian Russell on Ann Mroz's 12 November 2009
>> editorial "Put all the results out in the open" in Times Higher
>> Education.
>>
>> It's especially significant that Ian Russell -- CEO of the Association
>> of Learned and Professional Society Publishers (which, make no mistake
>> about it, includes all the big STM commercials too) -- should be
>> saying:
>>
>> "It's not 'lobbying from subscription publishers' that has stalled
>> open access, it's the realization that the simplistic arguments of the
>> open access lobby don't hold water in the real world... [with] open
>> access lobbyists constantly referring to the same biased and dubious
>> 'evidence' (much of it not in the peer reviewed literature)."
>>
>> Please stay tuned for more peer-reviewed evidence on this, but for now
>> note only that the study Ian Russell selectively singles out as not
>> biased or dubious -- the "first randomized trial" (Davis et al 2008),
>> which found that "Open access [OA] articles were no more likely to be
>> cited than subscription access articles in the first year after
>> publication" -- is the study that argued that in the host of other
>> peer-reviewed studies that have kept finding OA articles to be more
>> likely to be cited (the effect usually becoming statistically
>> significant not during but after the first year), the OA advantage
>> (according to Davis et al) is simply a result of a self-selection bias
>> on the part of their authors: Authors selectively make their better
>> (hence more citeable) articles OA.
>>
>> Russell selectively cites only this negative study, whose result is
>> more congenial to the publishing lobby, and selectively ignores as
>> "biased and dubious" all the positive (peer-reviewed) studies, as well
>> as the critique of the study in question (as being based on too short a
>> time interval and too small a sample, not even replicating the effect
>> it was attempting to demonstrate to be merely an artifact of a
>> self-selection bias). Russell also selectively omits to mention that
>> even the Davis et al study found an OA advantage for downloads within
>> the first year -- with other peer-reviewed studies having found that a
>> download advantage in the first year translates into a citation
>> advantage in the second year (e.g., Brody et al 2006).
>>
>> But fair enough. We've now tested whether the self-selected OA
>> advantage is reduced or eliminated when the OA is mandated rather than
>> self-selective. The results will be announced as soon as they have
>> gone through peer review. Meanwhile, place your bets...
>>
>> Brody, T., Harnad, S. and Carr, L. (2006) Earlier Web Usage Statistics
>> as Predictors of Later Citation Impact. Journal of the American
>> Society for Information Science and Technology (JASIST) 57(8) pp.
>> 1060-1072.
>>
>> Davis, PM, Lewenstein, BV, Simon, DH, Booth, JG, & Connolly, MJL
>> (2008) Open access publishing, article downloads, and citations:
>> randomised controlled trial. British Medical Journal 337: a568.
>>
>> Harnad, S. (2008) Davis et al's 1-year Study of Self-Selection Bias:
>> No Self-Archiving Control, No OA Effect, No Conclusion.
>>
>> Hitchcock, S. (2009) The effect of open access and downloads ('hits')
>> on citation impact: a bibliography of studies.
Received on Wed Nov 18 2009 - 19:51:03 GMT
