Re: Free Access vs. Open Access from Matthew Cockerill on 2003-08-11 (American-Scientist-Open-Access-Forum)

From: Matthew Cockerill <matt_at_BIOMEDCENTRAL.COM>
Date: Mon, 11 Aug 2003 17:48:23 +0100

Stevan asks:

         "The use one makes of those full texts is to read them,
          print them off, quote/comment them, cite them, and use
          their *contents* in further research, building on them.
          What is "re-use"? And what is "redistribution" (when
          everyone on the planet with access to the web has access
          to the full-text of every such article)?"

Having free access to articles on the publisher's website would certainly
offer progress compared to the current status quo. But it would not offer
anything like the benefits of true open access. Here are just some of the
reasons why re-use and re-distribution rights are vital to open access:

(1) Digital permanence - it is not enough for the publisher to be the only
body which curates the full archive of published research content. To ensure
long term digital permanence of the scientific record, it is vital that
articles should be deposited with multiple archives, and redistributable
from and between those archives.

(2) A flexible choice of tools for searching and browsing
Google exists because the web is free for anyone to
download and index. As a result, there is competition among search engines,
and Google had the incentive to develop a better system for indexing web
pages, which has since driven other search engine companies to improve the
tools they offer.

Compare this with the situation with scientific research. If the research
resides only on the publisher's site, you don't have a free choice of what
tools you use to search and browse it - you are stuck with what that
particular publisher provides you with.

This ties in with developments in Grid computing (e.g.
http://www.escience-grid.org.uk/ ). With open access, published research
would be available "on tap" via the grid, and scientists would be able to
use their preferred choice of grid tools to access the data, rather than
being stuck with the tools provided by the publisher.

(3) Datamining

With a million or so biomedical research articles being published each year,
the sheer volume of output is an obstacle to the comprehension and synthesis
of the results reported in that research. If the XML of the articles can be
brought together in one place then the tools of datamining can be applied to
it to extract useful but non-obvious information.

The simplest type of datamining is citation analysis.

Currently you need to pay ISI a lot of money to find out what cites what,
but with true open access, citation analysis becomes trivial.

So, for example, if you view a PubMed record:
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=11667947&dopt=Abstract
you already get links to all the full text articles in PubMed Central which
cite that PubMed item:
http://www.pubmedcentral.gov/tocrender.fcgi?action=cited&tool=pubmed&pubmedid=11667947

The more true open access research that is published and archived at PubMed
Central, the more useful this becomes for biomedical researchers. [Sure,
"screen-scraping" HTML from free articles displayed on publisher sites could
give some citation information, but with nothing like the ease, accuracy and
reliability with which it can be obtained from XML data, as at PubMed
Central].
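
To illustrate why structured XML makes citation analysis easy, here is a
minimal Python sketch that builds a "cited by" index from a handful of
article records. The element and attribute names (article, ref, pmid) and the
sample records are invented for this example; they are an assumption, not the
actual PubMed Central schema, and the second and third PubMed IDs are made up.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, simplified article XML. Real archive schemas are richer
# and use different element names.
SAMPLE_ARTICLES = [
    '<article pmid="100"><ref-list>'
    '<ref pmid="11667947"/><ref pmid="42"/>'
    '</ref-list></article>',
    '<article pmid="101"><ref-list>'
    '<ref pmid="11667947"/>'
    '</ref-list></article>',
    '<article pmid="102"><ref-list>'
    '<ref pmid="42"/><ref/>'  # a reference with no identifier is skipped
    '</ref-list></article>',
]


def build_cited_by_index(xml_documents):
    """Map each cited ID to the set of article IDs that cite it."""
    cited_by = defaultdict(set)
    for doc in xml_documents:
        root = ET.fromstring(doc)
        citing_id = root.get("pmid")
        for ref in root.iter("ref"):
            cited_id = ref.get("pmid")
            if cited_id:
                cited_by[cited_id].add(citing_id)
    return cited_by


index = build_cited_by_index(SAMPLE_ARTICLES)
print(sorted(index["11667947"]))
```

With the citing and cited identifiers explicit in the markup, the whole
analysis is a single pass over the corpus; scraping the same information out
of rendered HTML would need per-publisher heuristics.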

Beyond citation analysis, many other forms of datamining are possible. For
more information see:
http://www.biomedcentral.com/info/about/datamining/

For example, research articles can be mined for details of protein
interactions:
http://bioinfo.mshri.on.ca/prebind/

And as scientific content is increasingly marked up using richer forms of
semantically meaningful XML (e.g. CML for chemical structures, MathML for
equations), the value of datamining will continue to increase.

The BioLINK group are using BioMed Central's open access corpus as the raw
material for a datamining competition, designed to stimulate progress in the
development of tools for biological datamining.
http://www.pdg.cnb.uam.es/BioLINK/BioCreative_task2.html

(4) Derivative works and compilations
Say that a scientist performs a meta-analysis on a group of published
clinical trials, and wants to make available the conclusions of that
research. Or perhaps a datamining researcher has taken a corpus of 1000
articles on breast cancer, and established some interesting conclusions.

In a true open access environment, each is free to post the results of their
research, *along with* the actual corpus of data which the research was
based on (effectively, the raw data for that research).
But in a non-open access environment, that raw data (i.e. the research
articles) cannot be redistributed, which makes it far more difficult than it
needs to be for other scientists to reproduce, critique and follow up the
work.

Similarly, a scientist may wish to make a point by assembling a collection
of certain articles or article fragments (perhaps they wish to assemble a
comparison of the methods used for a certain technique).
In an open access world, as long as they cite the sources, they are
completely free to create and redistribute that compilation. Such a
selective compilation may in itself be an extremely useful contribution to
science.

(5) Print redistribution rights - the National Health Service, for example,
should be able to redistribute thousands of printed copies of an important
research article (which it may have funded) to its doctors if it wishes to
do so. It should not have to pay a hefty copyright fee for the privilege.
Certainly, print redistribution will likely become less significant in the
future, but there is no logical reason that the scientific community should
not be free to exchange and distribute the research that it has created in
print form, as well as online.

Matt Cockerill

==
Matthew Cockerill Ph.D.
Technical Director
BioMed Central Limited (http://www.biomedcentral.com)
34-42, Cleveland Street
London W1T 4LB

Email: matt_at_biomedcentral.com

> -----Original Message-----
> From: Stevan Harnad [mailto:harnad_at_ecs.soton.ac.uk]
> Sent: 11 August 2003 03:40
> To: AMERICAN-SCIENTIST-OPEN-ACCESS-FORUM_at_LISTSERVER.SIGMAXI.ORG
> Subject: Free Access vs. Open Access
>
>
> BioMedCentral's "Open Access Now" is a useful newsletter, but its first
> editorial contains some inadvertently misleading information that needs
> to be corrected. What
> http://www.biomedcentral.com/openaccess/#article1
> actually said was this:
>
> > "Free Access is not Open Access"
> >
> > "There seems to be a general misunderstanding that the aim of the
> > Open Access movement is to make the scientific research literature
> > free online. But there is a difference between "free access"
> > and "open access"...
> >
> > "The benefits and promise of Open Access will only be realized
> > when this distinction is clear in the minds of authors and
> > publishers. Only then can the literature move from being `free'
> > to being truly `open'."
>
> I will quote/comment the full (short) editorial in a moment to show
> why I think what it *should* instead have said is this:
>
> "Open Access Calls for Both Free Access and Open Usage"
>
> "There seems to be a general misunderstanding that the aim of the Open
> Access movement is *only* to make the scientific research literature
> free online... That is the first aim, but it also aims to make it
> fully usable."
>
> The difference between the two messages is substantial. We are very far
> from having free access to the refereed research literature, even though
> it is within reach; vast amounts of potential research impact are for
> this reason being needlessly lost; and it is free access that is urgently
> needed to put an end to this loss. What free access we do have today,
> however, is not constrained by any usage constraints. Hence the
> difference between "free access" and "open access" is merely
> hypothetical right now: What is needed is more free access, not an
> extension of free access to open access. To imply otherwise is simply to
> saddle the research community with yet another red herring, instead of
> what it really needs.
>
> Here is the current situation, in rough practical and statistical
> terms:
>
> (a) What the BOAI seeks is unrestricted toll-free
> full-text online access to the entire refereed research corpus
> (20,000 journals, 2,000,000 articles per year).
> http://www.soros.org/openaccess/read.shtml
>
> (b) The way to achieve this is for researchers to (1) publish their
> papers in open-access journals whenever suitable ones exist
> (under 5% currently) and, for the rest of their papers (95%), to
> (2) self-archive them in their own institutional archives. [(1)
> is BOAI Strategy 2, and (2) is BOAI Strategy 1.]
> http://www.earlham.edu/~peters/fos/boaifaq.htm#journals
> http://www.eprints.org/self-faq/
>
> (c) Any form of restricted, gerrymandered online access (such as
> "ebrary"-based access that prevents down-loading, saving or
> printing-off) would not be open access (but there is none in sight
> so far to speak of).
>
> That is all there is to it! Now, for those who are interested, a more
> detailed quote/comment of the full (short) BMC editorial:
>
> > Free Access is not Open Access
>
> Not necessarily, in theory; but in reality and in practice, *all* of the
> growing body of research today that is free-access is also open-access:
> http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving.ppt
>
> It can all be downloaded, saved, grepped, printed out, quote/commented,
> and the URL can be sent to anyone who wishes to do likewise. All data
> therein can also be used, *exactly* as they could be if read and copied
> from the on-paper version. (It is simply an error, in other words, to
> think of refereed, published articles as analogous to the genome database
> or to software. It consists instead of texts, which are written to
> be printed off, read, used, applied, built-upon, quoted/commented,
> and cited. There is no question -- or need -- of republishing them or
> altering them. They are already freely accessible to anyone with access
> to the Web, and the only ones to update them are the authors; everyone
> else must settle for quote/commenting, applying and citing.)
>
> > There seems to be a general misunderstanding that the aim of the Open
> > Access movement is to make the scientific research literature free
> > online. But there is a difference between "free access" and "open access".
>
> The aim of the Open Access movement *is* to make the scientific
> (and scholarly) refereed-journal research literature -- full-text --
> accessible toll-free online. Though there may be hypothetical ways
> toll-free online access could be constrained so as to prevent
> downloading, grepping, or printing, no such thing is happening. All the
> free-access literature is also open-access.
>
> > This distinction was part of what motivated the Bethesda definition of
> > Open Access Principles that we published in the first issue of Open Access
> > Now (July 14, 2003). That definition clearly states that access to the
> > information should be free, but in addition the work should be open to
> > re-use and redistribution
>
> "Re-use and redistribution" has to be thought out more fully and clearly
> than it is in the Bethesda definition -- insofar as refereed journal
> articles are concerned. We are not talking about shared empirical
> databases here but about the articles that appear in the 20,000 existing
> (toll-access) peer-reviewed journals. The use one makes of those full
> texts is to read them, print them off, quote/comment them, cite them,
> and use their *contents* in further research, building on them. What is
> "re-use"? And what is "redistribution" (when everyone on the planet with
> access to the web has access to the full-text of every such article)?
>
> > and that it should be deposited immediately
> > upon publication in a public online repository (such as PubMed Central).
>
> For the 95% solution, BOAI-1, depositing those toll-access articles in
> the author's own institutional repository is the *means* by which they are
> made free-access, by definition. For the remaining 5% (BOAI-2),
> the fact that they are published by an open-access journal *entails*
> (again, by definition) that they must be made freely accessible online
> *somehow*. Likewise depositing them in a public online repository
> (whether in a central one, like PubMed Central, or -- why not? -- in
> the author's own institutional repository, this time too) seems like a
> congenial solution to providing this essential feature of what it is
> that makes an open-access journal open-access!
>
> > Publishers who offer free online access on their own websites still have
> > a long way to go before their research articles can be considered Open
> > Access.
>
> I know of no publisher-provided toll-free online full-text access with
> "ebrary"-style constraints on downloading, grepping, printing, etc. But
> if there *are* any such cases (and they can successfully prevent
> downloading, grepping, printing, etc.) then that sort of gerrymandered
> access should not count as open access, and that publisher certainly
> doesn't count as an open-access publisher.
>
> But what is the point? BOAI-1 is institutional self-archiving, not
> publisher self-archiving, and it involves no ebrary-style
> gerrymandering; and BOAI-2 *does* guarantee unconstrained
> access. The fact that toll-access publishers do *not* provide toll-free
> access is the whole point of the BOAI movement! If they did, we could
> all go home now (and access it all)!
>
> > The benefits and promise of Open Access will only be realized when
> > this distinction is clear in the minds of authors and publishers.
>
> I think authors know perfectly well when they can and cannot access the
> full text of an article (including download, storage, grepping
> and printout) toll-free. Toll-access publishers know the difference
> too. The difference between unconstrained free access and gerrymandered
> ebrary-style access will also be fully felt and appreciated -- if and
> when it ever comes to pass. So far, it's nowhere in sight! Hence, at
> the moment, *all* the benefits of Open Access reside in free, full-text,
> online access of the sort that a growing number of articles already have
> (most of them through BOAI-1) but that most of the 2,000,000 articles
> published annually still lack. It will not help them get it if we seek the
> benefits and promise from promoting the free/open distinction, rather
> than from promoting free access!
>
> > Only then can the literature move from being `free' to being truly `open'.
>
> The "move" we should all be dedicating 100% of our energy and attention
> to is the move from toll-access to free-access. That's the move that
> awaits us impatiently, to at last stem our daily needless
> impact-loss. There is no free-access literature straining to move from
> free-access to open-access anywhere in sight at the moment.
>
> Stevan Harnad
>
> NOTE: A complete archive of the ongoing discussion of providing open
> access to the peer-reviewed research literature online is available at
> the American Scientist September Forum (98 & 99 & 00 & 01 & 02 & 03):
>
> http://amsci-forum.amsci.org/archives/American-Scientist-Open-Access-Forum.html
> or
> http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/index.html
>
> Discussion can be posted to: american-scientist-open-access-forum_at_amsci.org
>

Received on Mon Aug 11 2003 - 17:48:23 BST
