Re: On Not Conflating Open Data (OD) With Open Access (OA)

From: Stevan Harnad <amsciforum_at_GMAIL.COM>
Date: Thu, 20 May 2010 16:11:37 -0400

When should research data be made OD? Not immediately upon
collection, since then the collectors lose the first crack at mining
their own hard-won data.

Benjamin Geer suggests immediately upon publication (presumably
the publication of a refereed journal article based on the
data in question). But the first of the collector's articles based on
that collection, or the last? How many are allowed with exclusivity,
and for how long?

That's why I said there are many other questions and problems peculiar
to OD that are not shared with OA (and should not be linked to OA,
since that would make consensus on adopting an OA mandate harder to
reach, and the mandate itself less likely to be complied with).

That said, the data on which a publication is based should immediately
be open to auditing -- but not necessarily OD.

Some more replies below:

On Thu, May 20, 2010 at 11:06 AM, Benjamin Geer wrote:

> Stevan, although I agree with you on Open Access, I disagree with you on
> Open Data. There are strong arguments for making scientific data publicly
> accessible at the time of publication.

At the time of the first publication the author derives from that
data-set? He gets only one exclusive crack at it?

> Norms of methodological transparency encourage honesty in the reporting of
> research results. In a worst-case scenario, pressures for career
> advancement, tenure, or prestige may create perverse incentives to “publish
> or perish” that, if not countered with some form of accountability, can
> easily lead researchers to misstate conclusions.

Many forms of accountability are possible that fall short of immediate OD.

> Yet erroneous inferences may
> not even necessarily result from nefarious intentions. Simple coding errors
> or a flawed syntax file can produce results that the investigator believes
> to be correct even when they are not. Making the data and programming
> decisions publicly available limits the extent to which bad findings
> influence future research.

They are already open to the referees, if requested. They ought to be
open to some auditors too. But OD is rather more than that.

> There are, in fact, ample examples of errors in quantitative analysis
> leading to—at best—ambiguity in findings. One replication of a 1986 American
> Sociological Review article led to a debate over whether four different
> couples in the analyzed survey sample were really having sex 88 times a
> month, or if the 88s in the data file were actually meant to refer to
> missing observations (Jasso 1985, 1986; Kahn and Udry 1986). A broader
> study in economics by Dewald, Thursby, and Anderson (1986) sought to
> replicate a year’s worth of articles in the Journal of Money, Credit, and
> Banking. The principal finding was that, in the vast majority of cases, it
> was entirely impossible to exactly replicate the published results even with
> the help of the articles’ original authors. This led to the adoption of more
> stringent requirements in journals such as the American Economic Review
> requiring that data be made available at the time of publication."
> Jeremy J. Albright and Jared A. Lyle, “Data Preservation Through Data
> Archives,” PS: Political Science & Politics 43, no. 01 (2010): 17-21.

All very important, and reason for rigorous refereeing and aggressive
auditing -- but not yet OD.

> I also disagree that mandating open data will remove researchers' incentive
> to collect the data in the first place. Their incentive to collect the data
> will still be that they will get the first opportunity to interpret the data
> and publish their interpretation.

Just until their first publication?

> If the data are really novel and their
> interpretation is valid, that is all they need to advance their careers as
> researchers.

One publication? What if they've gathered a lot of time-consuming
data, amenable to a lot of time-consuming analysis?

> Embargoes on the publication of data just slow down science,
> holding it hostage to the self-interest of a single researcher, by giving
> that researcher a monopoly on the use of the data in question, forbidding
> others from attempting to verify that researcher's interpretation.

In some cases you may well be right. But it's not clear whether that's
most cases. In contrast, OA is exception-free: from the moment of
acceptance for publication, refereed research findings can and should
immediately be made OA.


> Ben
Received on Thu May 20 2010 - 21:12:16 BST