Re: Whether Self-Selected or Mandated, Open Access Increases Citation Impact for Higher Quality Research

From: Stevan Harnad <harnad_at_ecs.soton.ac.uk>
Date: Thu, 7 Jan 2010 14:34:49 -0500

On 7-Jan-10, at 6:50 AM, Philip Davis wrote:

> An interesting bit of research, although I have some methodological
> concerns about how you treat the data, which may explain some
> inconsistent and counter-intuitive results, see:
> http://j.mp/8LK57u
> A technical response addressing the methodology is welcome.
> Philip M. Davis
> PhD Student
> Department of Communication
> 301 Kennedy Hall
> Cornell University, Ithaca, NY 14853
> email: pmd8_at_cornell.edu
> phone: 607 255-2124
> https://confluence.cornell.edu/display/~pmd8/resume
> http://scholarlykitchen.sspnet.org/author/pmd8/

Thanks for the feedback. We reply to the three points of substance, in
order of importance:

(1) LOG RATIOS: We analyzed ratios of log citations to adjust for
departures from normality: the log transformation normalizes the
citation counts and attenuates distortion from high values. This
approach loses some values when the log transformation makes the
denominator zero, but despite these lost data, the t-test results were
significant, and were further confirmed by our second, logistic
regression analysis. Moed's
(2007) point was about (non-log) ratios that were not used in this
study. We used the ratio of log citations and not the log of citation
ratios. When we compare log3/log2 with log30/log20, we don't compare
percentages with percentages (60% with 14%) because the citation
values are transformed or normalized: the higher the citations, the
stronger the normalisation. It is highly unlikely that any of this
would introduce a systematic bias in favor of OA, but if the referees
of the paper should call for a "simpler and more elegant" analysis to
make sure, we will be glad to perform it.
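
To make the arithmetic in point (1) concrete, here is a minimal sketch
in Python. It is an illustration only, not the code used in the study:
the function names are hypothetical, natural logs are assumed (the base
cancels in a ratio of logs), and returning None for a zero citation
count is an added assumption, since the text above mentions only the
zero-denominator case.

```python
import math


def ratio_of_logs(oa_citations, non_oa_citations):
    """Return log(oa) / log(non_oa), or None when it cannot be formed.

    Hypothetical helper for illustration, not the study's own code.
    """
    if oa_citations <= 0 or non_oa_citations <= 0:
        return None  # assumption: log of a zero count is undefined, so drop it
    denominator = math.log(non_oa_citations)
    if denominator == 0:
        return None  # log(1) == 0: the lost-value case described above
    return math.log(oa_citations) / denominator


def log_of_ratio(oa_citations, non_oa_citations):
    """Return log(oa / non_oa), the alternative measure that was NOT used."""
    if oa_citations <= 0 or non_oa_citations <= 0:
        return None
    return math.log(oa_citations / non_oa_citations)


# Same raw ratio (1.5) at two different citation levels.
low = ratio_of_logs(3, 2)      # about 1.585, i.e. roughly 60% above 1
high = ratio_of_logs(30, 20)   # about 1.135, i.e. roughly 14% above 1

print(f"log3/log2   = {low:.3f}")
print(f"log30/log20 = {high:.3f}")
print(f"log(3/2)    = {log_of_ratio(3, 2):.3f}")
print(f"log(30/20)  = {log_of_ratio(30, 20):.3f}")
print(f"non-OA count of 1 -> {ratio_of_logs(5, 1)}")
```

The log of the ratio is identical for 3 vs. 2 and 30 vs. 20, whereas
the ratio of logs shrinks as the counts grow, which is the sense in
which higher citation values are normalized more strongly.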

(2) Effect Size: The size of the OA Advantage varies greatly from year
to year and field to field. We reported this in Hajjem et al (2005),
stressing that the important point is that there is virtually always a
positive OA Advantage, absent only when the sample is too small or
the effect is measured too early (as in Davis et al's 2008 study). The
consistently bigger OA Advantage in physics (Brody & Harnad 2004) is
almost certainly an effect of the Early Access factor, because in
physics, unlike in most other disciplines (apart from computer science
and economics), authors tend to make their unrefereed preprints OA
well before publication. (This too might be a good practice to
emulate, for authors desirous of greater research impact.)

(3) Mandated OA Advantage? Yes, the fact that the citation advantage
of mandated OA was slightly greater than that of self-selected OA is
surprising, and if it proves reliable, it is interesting and worthy of
interpretation. We did not interpret it in our paper, because it was
the smallest effect, and our focus was on testing the Self-Selection/
Quality-Bias hypothesis, according to which mandated OA should have
little or no citation advantage at all, if self-selection is a major
contributor to the OA citation advantage.

Our sample was 2002-2006. We are now analyzing 2007-2008. If there is
still a statistically significant OA advantage for mandated OA over
self-selected OA in this more recent sample too, a potential
explanation is the inverse of the Self-Selection/Quality-Bias
hypothesis (which, by the way, we do think is one of the several
factors that contribute to the OA Advantage, alongside the other
contributors: Early Advantage, Quality Advantage, Competitive
Advantage, Download Advantage, Arxiv Advantage, and probably others).
http://openaccess.eprints.org/index.php?/archives/29-guid.html

The Self-Selection/Quality-Bias (SSQB) consists of better authors
being more likely to make their papers OA, and/or authors being more
likely to make their better papers OA, because they are better, hence
more citeable. The hypothesis we tested was that all or most of the
widely reported OA Advantage across all fields and years is just due
to SSQB. Our data show that it is not, because the OA Advantage is no
smaller when it is mandated. If it turns out to be reliably bigger,
the most likely explanation is a variant of the "Sitting Pretty" (SP)
effect, whereby some of the more comfortable authors have said that
the reason they do not make their articles OA is that they think they
have enough access and impact already. Such authors do not self-
archive spontaneously. But when OA is mandated, their papers reap the
extra benefit of OA, with its Quality Advantage (for the better, more
citeable papers). In other words, if SSQB is a bias in favor of OA on
the part of some of the better authors, mandates reverse an SP bias
against OA on the part of others of the better authors. Spontaneous,
unmandated OA would be missing the papers of these SP authors.
http://www.eprints.org/openaccess/self-faq/#29.Sitting

There may be other explanations too. But we think any explanation at
all is premature until it is confirmed that this new mandated OA
advantage is indeed reliable and replicable. Phil further singles out
the fact that the mandate advantage is present in the middle citation
ranges and not the top and bottom. Again, it seems premature to
interpret these minor effects whose reliability is unknown, but if
forced to pick an interpretation now, we would say it was because the
"Sitting Pretty" authors may be the middle-range authors rather than
the top ones...

Yassine Gargouri, Chawki Hajjem, Vincent Lariviere, Yves Gingras, Les
Carr, Tim Brody, Stevan Harnad

Brody, T. and Harnad, S. (2004) Comparing the Impact of Open Access
(OA) vs. Non-OA Articles in the Same Journals. D-Lib Magazine 10(6).
http://eprints.ecs.soton.ac.uk/10207/

Davis, P.M., Lewenstein, B.V., Simon, D.H., Booth, J.G., Connolly,
M.J.L. (2008) Open access publishing, article downloads, and
citations: randomised controlled trial. British Medical Journal
337:a568. http://www.bmj.com/cgi/reprint/337/jul31_1/a568

Hajjem, C., Harnad, S. and Gingras, Y. (2005) Ten-Year Cross-
Disciplinary Comparison of the Growth of Open Access and How it
Increases Research Citation Impact. IEEE Data Engineering Bulletin
28(4) 39-47. http://eprints.ecs.soton.ac.uk/11688/

Moed, H. F. (2007) The effect of 'Open Access' upon citation impact:
An analysis of ArXiv's Condensed Matter Section. Journal of the
American Society for Information Science and Technology 58(13)
2145-2156. http://arxiv.org/abs/cs/0611060

> Stevan Harnad wrote:
>> Self-Selected or Mandated, Open Access Increases Citation Impact for
>> Higher Quality Research
>>
>> http://arxiv.org/abs/1001.0361
>>
>> Yassine Gargouri, Chawki Hajjem, Vincent Lariviere, Yves Gingras, Les
>> Carr, Tim Brody, Stevan Harnad
>>
>> ABSTRACT: Articles whose authors make them Open Access (OA) by
>> self-archiving them online are cited significantly more than articles
>> accessible only to subscribers. Some have suggested that this "OA
>> Advantage" may not be causal but just a self-selection bias, because
>> authors preferentially make higher-quality articles OA. To test this
>> we compared self-selective self-archiving with mandatory
>> self-archiving for a sample of 27,197 articles published 2002-2006 in
>> 1,984 journals. The OA Advantage proved just as high for both.
>> Logistic regression showed that the advantage is independent of other
>> correlates of citations (article age; journal impact factor; number of
>> co-authors, references or pages; field; article type; or country) and
>> greatest for the most highly cited articles. The OA Advantage is real,
>> independent and causal, but skewed. Its size is indeed correlated with
>> quality, just as citations themselves are (the top 20% of articles
>> receive about 80% of all citations). The advantage is greater for the
>> more citeable articles, not because of a quality bias from authors
>> self-selecting what to make OA, but because of a quality advantage,
>> from users self-selecting what to use and cite, freed by OA from the
>> constraints of selective accessibility to subscribers only.
>>
>> http://eprints.ecs.soton.ac.uk/18346/
Received on Thu Jan 07 2010 - 19:37:25 GMT
