No search/findability problem for OA IRs

From: Stevan Harnad
Date: Mon, 11 Feb 2008 01:58:51 +0000

On Sun, 10 Feb 2008, David E. Wojick wrote:

> Steve, I do not regard establishing and maintaining IR's in tens
> of thousands of institutions around the world...

David, most of the major US, European, Australian, and Japanese
universities already have IRs, and the rest soon will too, in any case.

> and getting millions of authors to archive their writings each year...

The authors already do the keystrokes to write the paper, store it, submit
it and revise it. It's just a few more keystrokes per paper to deposit it.

> as simple or costing "next to nothing."

It's distributed, simple, and costs next to nothing per author/paper.
And all it needs institutionally is an IR and a mandate.

> Your version of OA looks to be quite burdensome indeed.

Not to the growing number of universities and funders that have actually
mandated it already. Stay tuned!

(By the way, this "version" of OA is called Green OA self-archiving.
Gold OA publishing is far slower, more complicated, more burdensome,
and cannot be mandated, but it too will probably come after 100% Green
OA has been reached.)

> In any case, my delivery solution does not require more metadata. I
> tend to think of metadata as obsolete as a search tool, and burdensome,
> preferring full text...

With OA IR mandates, you have full text too. Google has already indexed
it (inverted the full text); and you can harvest it all if you like.

> My approach is exemplified by and the new
> , both of which already probably exceed
> plain Google in science content.

Congratulations! They both look very good! They'll be even better once
the IRs are mandated and filled, providing your two search engines with
much more content.

> (Google Scholar is largely pay-per-paper, not OA.)

I'm not sure what you mean. You don't pay GS anything, and if the linked
paper is on a publisher's toll-access site, GS lists any free alternatives
under "all versions" (if they exist). (But it's a *good* thing that GS
indexes toll-access as well as OA papers.)

> Our approach involves external federation of existing collections,
> which imposes no new burden on the institution, unlike Google's sitemap
> protocol. We are starting with the biggest collections first, then working
> our way down. If I have to find, federate and then track every college
> and institute IR in the world it will not be easy, hence my concern.

It will be easy. All the IRs will be OAI-compliant, harvested by
OAIster, and registered in ROAR and OpenDOAR. All you'll need to do is
harvest them from the listings in bulk.
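
To make "harvest them in bulk" concrete, here is a minimal sketch, assuming
each IR exposes a standard OAI-PMH 2.0 endpoint with the mandatory oai_dc
metadata format. The base URL is hypothetical (in practice it would come
from a ROAR or OpenDOAR listing), and the function names are illustrative
only; this is not how OAIster or any particular harvester is implemented.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# XML namespaces defined by OAI-PMH 2.0 and Dublin Core.
OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, token=None):
    """Build a ListRecords request URL for one page of results."""
    if token:
        # resumptionToken is an exclusive argument: send it with the verb only.
        params = {"verb": "ListRecords", "resumptionToken": token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    return base_url + "?" + urllib.parse.urlencode(params)


def parse_page(xml_text):
    """Return ([(identifier, title), ...], next_token_or_None) for one page."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(OAI + "record"):
        ident = rec.findtext(OAI + "header/" + OAI + "identifier")
        # Deleted records carry a header but no metadata, so title may be None.
        title = next((t.text for t in rec.iter(DC + "title")), None)
        records.append((ident, title))
    tok = root.find(".//" + OAI + "resumptionToken")
    token = tok.text.strip() if tok is not None and tok.text else None
    return records, token


def http_get(url):
    """Fetch a URL and decode the body as UTF-8."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")


def harvest(base_url, fetch=http_get):
    """Yield (identifier, title) for every record, following resumption tokens."""
    token = None
    while True:
        records, token = parse_page(fetch(list_records_url(base_url, token)))
        yield from records
        if not token:
            break


if __name__ == "__main__":
    # Hypothetical endpoint; substitute a base URL taken from a registry listing.
    for ident, title in harvest("http://repository.example.edu/oai"):
        print(ident, title)
```

The fetch parameter is there only so the paging logic can be exercised
without a network connection; a real harvester would also want error
handling, rate limiting and incremental (from/until) requests.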

> You also seem to be claiming that no further work is needed to
> improve the findability of raw access scientific content, until your
> (utopian?) vision of universal OA is completed. Needless to say I do not
> agree. There is much to be done, even if OA fails to materialize. One
> of our working principles is not to wait for visions to come true.

I am talking only about OA's target content. (There's plenty to do in
other areas too, but that has nothing to do with OA and the OA mandates
that are the only things I'm talking about.) What I resist resolutely
is anything that slows OA down or gets in its way by loading it down
or complicating it with needless burdens that apply to other kinds of
content but need not apply to OA. (I also resist any implication that
OA is not enough, or that it needs to wait for other technical problems,
such as preservation, metadata enhancement or search, to be solved.)

Stevan Harnad

If you have adopted or plan to adopt a policy of providing Open Access
to your own research article output, please describe your policy at:

    BOAI-1 ("Green"): Publish your article in a suitable toll-access journal
    BOAI-2 ("Gold"): Publish your article in an open-access journal if/when
    a suitable one exists.
    in BOTH cases self-archive a supplementary version of your article
    in your own institutional repository.
Received on Mon Feb 11 2008 - 02:21:19 GMT