On Not Conflating Open Data (OD) With Open Access (OA)

From: Stevan Harnad <amsciforum_at_GMAIL.COM>
Date: Thu, 20 May 2010 08:21:18 -0500

On Thu, May 20, 2010 at 5:52 AM, [identity deleted] wrote:

> I hope you don’t mind my asking you for guidance – I follow the IR list and
> you are obviously expert in this area.
>
> I am having a debate with a colleague who argues that forcing researchers to
> give up their data to archives and repositories breaches their autonomy and
> control over intellectual property.  He goes so far as to position the
> entire open access movement in the camp of the neoliberal agenda of
> commodifying knowledge for capitalist dominated state authority (at the
> expense of researchers – often very junior team members – who actually
> create the data).

It is important to distinguish OA (Open Access to refereed research
journal articles) from Open Data (Open Access to research data).

http://users.ecs.soton.ac.uk/harnad/Temp/OAwhat.png

All researchers, without exception, want to maximise access to their
refereed research findings as soon as they are accepted for
publication by a refereed journal, in order to maximise their uptake,
usage and impact. Otherwise they would not be providing access to
them, by publishing them. The impact of their research findings is
what their careers, as well as research progress, are all about.

But raw data are not research findings until they have been data-mined
and analysed. Hence, by the same token (with rare exceptions),
researchers are not merely data-gatherers, collecting data so that
others can go on to do the data-mining and analysis: In science
especially, their data-collection is driven by their theories, and
their attempts to test and validate them. In the humanities too, the
intellectual contributions are rarely databases themselves, but the
author's analysis and interpretation of them -- and often in books,
which are not part of OA's primary target content, because books are
definitely not all or mostly giveaway content, written solely to
maximise their uptake, usage and impact (at least not yet).

In short, with good reason, OD is not exception-free author give-away
content, whereas OA is. It may be reasonable, when data-gathering is
funded, that the funders stipulate how long the data may be held for
exclusive data-analysis by the fundee, before it must be made openly
accessible. But in general, primary research data -- just like books,
software, audio, video, and unrefereed research -- are not amenable to
OA mandates because there may be good reasons why their creators do
not wish to make them OA, at least not immediately. Indeed, that is
the reason that all OA mandates, whether by funders or universities,
are very specifically restricted to refereed research journal
publication.

In the new world of OA mandates, which is merely a PostGutenberg
successor to the world of "publish-or-perish" mandates, it is
critically important to carefully distinguish what is required (and
why) from what is merely recommended (and why).

> I agree there is a risk of misuse and appropriation of the open access
> agenda, but that is true for any technology, or any social change more
> generally.

Researchers' unwillingness to make their laboriously gathered data
immediately OA is not just out of fear of misuse and misappropriation.
It is much closer to the reason that a sculptor does not do the hard
work of mining rock for a sculpture only in order to put the rock on
craigslist for anyone to buy and sculpt for themselves, let alone
putting it on the street corner for anyone to take home and sculpt for
free. That just isn't what sculpture is about. And the same is true of
research (apart from some rare exceptions, like the human genome
project, where the research itself is the data-gathering, and the
research findings are the data).

> And I believe researchers generally have more to gain than lose
> from sharing data but hard evidence on this point – again for data, not
> outputs, is almost non-existent so far. If you can direct me to any articles
> or arguments, I would be grateful.

There is no hard evidence on this because -- except in exceptional
cases -- it is simply not true. The work of science and scholarship
does not end with data-gathering, it begins with it, and motivates it.
If funders and universities mandated away the motivation to gather the
data, they would not be left with an obedient set of data-gatherers,
duly continuing to gather data so that anyone and everyone could then
go ahead and data-mine it. They would simply mandate away much of the
incentive to gather the data in the first place.

To put it another way: The embargo on making refereed research
articles immediately OA -- the access delay that publishers seek in
order to protect their revenue -- is the tail wagging the dog:
Research progress and researchers' careers do not exist in the service
of publishers' revenues, but vice versa. In stark contrast to this,
however, the "embargo" on making primary research data OD is necessary
(in most cases) if researchers are to have any incentive for gathering
data (and doing research) at all.

The length of the embargo is another matter, and can and should be
negotiated by research funders on a field by field or even a case by
case basis.

So although it is crucial not to conflate OA and OD (thereby
needlessly eliciting author resistance to OA when all they really want
to resist is immediate OD), there is indeed a connection between OA
and OD, and universal OA will undoubtedly encourage more OD to be
provided, sooner, than the current status quo does.

> An important point in addition is that the archives I work with, while
> aspiring to openness, cannot adopt full and unqualified open access.  Issues
> of sensitive and confidential data, and consent terms from human research
> subjects, have to be respected.  We strive to make data as open and free as
> possible, subject to these limits.  Typically, agreeing to a licence
> specifying legal and ethical use is all that is required.  So in fact,
> researchers do retain control, to some extent, over the terms and conditions
> of reuse when they deposit their data for sharing in data archives.

Yes, of course even OD will need to have some access restrictions, but
that is not the point, and that is not why researchers in general have
good reason not to be favorably disposed to immediate mandatory OD --
whereas they have no reason at all not to be favorably disposed to
immediate mandatory OA.

It is also important to bear in mind that the fundamental motivation
for OA is research access and progress, not research archiving and
preservation (although those are of course important too). Data must
of course be archived and preserved as well, but that, again, is not
OD. Closed Access data-archiving would serve that purpose -- and to
the extent that researchers store digital data in any form, closed
access digital archiving is what all researchers do already. Proposing
to help them with data-preservation is not the same thing as proposing
that they make their data immediately OD.

Stevan Harnad
Received on Thu May 20 2010 - 14:22:22 BST
