Confirmation Bias and the Open Access Advantage: Some Methodological Suggestions for Davis's Citation Study

From: Stevan Harnad <amsciforum_at_GMAIL.COM>
Date: Sun, 24 Aug 2008 22:27:43 -0400

Stevan Harnad


SUMMARY: Davis (2008) analyzes
citations from 2004-2007 in 11 biomedical journals. For 1,600 of the
11,000 articles (15%), their authors paid the publisher to make them
Open Access (OA). The outcome, confirming previous studies (on both
paid and unpaid OA), is a significant OA citation Advantage, but a
small one (21%, 4% of it correlated with other article variables such
as number of authors, references and pages). The author infers that
the size of the OA advantage in this biomedical sample has been
shrinking annually from 2004-2007, but the data suggest the opposite.
In order to draw valid conclusions from these data, the following
five further analyses are necessary:
          (1) The current analysis is based only on
      author-choice (paid) OA. Free OA self-archiving needs to
      be taken into account too, for the same journals and
      years, rather than being counted as non-OA, as in the
      current analysis.
          (2) The proportion of OA articles per journal per
      year needs to be reported and taken into account.
          (3) Estimates of journal and article quality and
      citability in the form of the Journal Impact Factor and
      the relation between the size of the OA Advantage and
      journal as well as article "citation-bracket" need to be
      taken into account. 
          (4) The sample for the highest-impact,
      largest-sample journal analyzed, PNAS, is restricted, and
      PNAS is excluded from some of the analyses. An analysis of
      the full PNAS dataset is needed, for the entire 2004-2007
      period.
          (5) The analysis of the interaction between OA and
      time, 2004-2007, is based on retrospective data from a
      June 2008 total cumulative citation count. The analysis
      needs to be redone taking into account the dates of both
      the cited articles and the citing articles, otherwise
      article-age effects and any other real-time effects from
      2004-2008 are confounded.
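
As a rough illustration of points (2) and (5), the sketch below computes
the proportion of OA articles per journal per year, and counts citations
only within a fixed window after each article's publication date, so that
older articles are not favored simply for having had longer to accumulate
citations. The records, journal labels, helper names and the 365-day
window are all hypothetical placeholders, not Davis's data or method.

```python
from collections import defaultdict
from datetime import date

# Hypothetical article records: id -> (journal, publication date, is_oa)
ARTICLES = {
    "a1": ("J1", date(2004, 3, 1), True),
    "a2": ("J1", date(2004, 9, 1), False),
    "a3": ("J1", date(2006, 1, 15), False),
    "a4": ("J2", date(2006, 6, 1), True),
}

# Hypothetical citation events: (cited article id, date of the citing article)
CITATIONS = [
    ("a1", date(2004, 12, 1)),
    ("a1", date(2005, 2, 1)),
    ("a1", date(2007, 8, 1)),
    ("a2", date(2005, 1, 10)),
    ("a3", date(2006, 11, 1)),
    ("a3", date(2008, 5, 1)),
    ("a4", date(2007, 2, 1)),
]


def oa_proportion(articles):
    """Return the fraction of OA articles per (journal, publication year)."""
    totals = defaultdict(int)
    oa_counts = defaultdict(int)
    for journal, published, is_oa in articles.values():
        key = (journal, published.year)
        totals[key] += 1
        oa_counts[key] += int(is_oa)
    return {key: oa_counts[key] / totals[key] for key in totals}


def windowed_citations(articles, citations, window_days=365):
    """Count citations received within window_days of each article's publication."""
    counts = {article_id: 0 for article_id in articles}
    for cited_id, citing_date in citations:
        published = articles[cited_id][1]
        age_in_days = (citing_date - published).days
        if 0 <= age_in_days <= window_days:
            counts[cited_id] += 1
    return counts


if __name__ == "__main__":
    print(oa_proportion(ARTICLES))
    print(windowed_citations(ARTICLES, CITATIONS))
```

In this toy sample, article a1 has three citations in a cumulative count
but only two inside its first 365 days; comparing OA and non-OA articles
on the windowed count keeps article age from being confounded with OA
status.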

The author proposes that an author self-selection bias for providing
OA to higher-quality articles (the Quality Bias, QB) is the primary
cause of the observed OA Advantage, but this study does not test or
show anything at all about the causal role of QB (or of any of the
other potential causal factors, such as Accessibility Advantage, AA,
Competitive Advantage, CA, Download Advantage, DA, Early Advantage,
EA, and Quality Advantage, QA). The author also suggests that paid OA
is not worth its cost per extra citation. This is probably true, but
with OA self-archiving, both the OA and the extra citations are free.
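
The cost-per-extra-citation reasoning can be sketched as simple
arithmetic. Only the 21% advantage comes from the summary above; the fee
and the baseline citation count are invented placeholders, and the
function name is hypothetical.

```python
def cost_per_extra_citation(fee, baseline_citations, advantage):
    """Fee divided by the extra citations attributable to the OA advantage."""
    extra_citations = baseline_citations * advantage
    if extra_citations <= 0:
        raise ValueError("no extra citations to divide the fee by")
    return fee / extra_citations


# Placeholder values: a fee of 1000 currency units and 10 baseline citations.
paid_oa = cost_per_extra_citation(fee=1000.0, baseline_citations=10.0, advantage=0.21)

# Self-archiving has no fee, so the cost per extra citation is zero
# (assuming, for illustration only, the same advantage applies).
self_archived = cost_per_extra_citation(fee=0.0, baseline_citations=10.0, advantage=0.21)

if __name__ == "__main__":
    print(round(paid_oa, 2), self_archived)
```

Whatever the real fee and baseline turn out to be, the second call shows
why the per-citation cost objection applies to paid OA only.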
