Re: OpenDOAR Search from Leslie Carr on 2006-10-27 (American-Scientist-Open-Access-Forum)

From: Leslie Carr <lac_at_ecs.soton.ac.uk>
Date: Fri, 27 Oct 2006 10:15:52 +0100

On 26 Oct 2006, at 19:00, Hubbard Bill wrote:

> Please find below an announcement from OpenDOAR for a search facility
> based on OpenDOAR holdings.

This is a very interesting service!

There was a discussion on this list at the beginning of August about
"Search Engines for Repositories Only". There were several attempts
to define constrained searches using RollYO or similar, but they all
suffered from one defect or another (too few sites, or logins
required etc). The Google Custom Search that OpenDOAR have set up
seems much better suited to the repository community's needs. Further,
it would seem to be fairly simple to set up Country-specific searches
(a la UKOLN's EPrints UK) by providing location-identifying
annotations for each repository.
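
As a rough illustration of what such annotations might look like, here
is a minimal Python sketch that writes one annotation per repository,
each carrying a search-engine label plus a country label. The
repository hosts, the country labels and the "_cse_example" engine label
are all hypothetical, and the Annotations/Annotation/Label element
names reflect my understanding of the Custom Search annotations file,
so check them against Google's documentation before relying on them.

```python
import xml.etree.ElementTree as ET

# Hypothetical repositories: (URL pattern, country label).
# These are placeholders, not real ROAR or OpenDOAR entries.
REPOSITORIES = [
    ("eprints.example.ac.uk/*", "uk"),
    ("repository.example.ac.uk/*", "scotland"),
    ("dspace.example.edu/*", "us"),
]


def build_annotations(repositories, engine_label="_cse_example"):
    """Return an annotations document as a string.

    Each repository gets the engine-wide label (so it is included in
    the search at all) and a country label (so results can be narrowed
    to one country). The element names are an assumption, see above.
    """
    root = ET.Element("Annotations")
    for pattern, country in repositories:
        annotation = ET.SubElement(root, "Annotation", about=pattern)
        ET.SubElement(annotation, "Label", name=engine_label)
        ET.SubElement(annotation, "Label", name=country)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_annotations(REPOSITORIES))
```

The point of the design is that the country is just another label, so
adding a new national view means adding labels, not building a new
search engine.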

I have had a go with this, and created a ROAR-based Repository Search
Engine at
http://google.com/coop/cse?cx=009118135948994945300%3Agvogitng0da
You can search all the ROAR repositories for a keyword and then Derek
Law can click on 'Scottish Research' to reduce the set of results to
those coming from the Scottish repositories (the "small and smart"
ones, according to his recent keynote at Open Scholarship :-)

There is a serious point that this opens up: why would we bother with
OAI-based repositories, if you can do it all with Google? The
advantage that OAI provided us was "metadata", i.e. the possibility of
providing more accurate resource identification. The advantage of
repositories was that they provided an identifiable source of
(well-maintained) research material. Of course, the one can be simulated by
the other, and if Google could support a simple quality control
"refereed material" tag then we could get by without OAI and without
repositories.

Well, it doesn't, and so OAI still seems our best hope. However, even
with five years of OAI our repositories are not doing a very good job
of sharing metadata that helps a service to comprehend the status of
the holdings that it harvests (is this a published, refereed journal
article or equivalent? Is this a paper from an unrefereed workshop?
Is this a chemical data file?). Too much is still down to
interpretation and subsequent data mining of the web pages. The
Eprints Application Profile
(http://www.ukoln.ac.uk/repositories/digirep/index/Eprints_Application_Profile)
seems to be doing a good
job in achieving consensus in the use of Dublin Core, but there is an
urgent need for it to be implemented by all repositories!
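
To make the problem concrete, here is a minimal Python sketch of what a
harvesting service would like to be able to do with a simple Dublin
Core (oai_dc) record: read the dc:type values and decide what kind of
item it has. The sample record is invented, and the type values tested
for ("PeerReviewed", "NonPeerReviewed", "Dataset") are assumptions
about what a repository might expose, not an agreed vocabulary, which
is exactly the gap described above.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core elements used inside oai_dc records.
DC = "{http://purl.org/dc/elements/1.1/}"

# Invented sample record, for illustration only.
SAMPLE = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An example paper</dc:title>
  <dc:type>Article</dc:type>
  <dc:type>PeerReviewed</dc:type>
</oai_dc:dc>"""


def describe_status(record_xml):
    """Classify a record from its dc:type values.

    The vocabulary checked here is hypothetical; a real service would
    need every repository to agree on the values it exposes.
    """
    root = ET.fromstring(record_xml)
    types = {(el.text or "").strip().lower() for el in root.iter(DC + "type")}
    if not types:
        return "unknown"  # nothing to go on except the web pages
    if "peerreviewed" in types:
        return "refereed"
    if "nonpeerreviewed" in types:
        return "unrefereed"
    if "dataset" in types:
        return "data"
    return "unclassified"  # types present, but none we recognise


if __name__ == "__main__":
    print(describe_status(SAMPLE))
```

Every record that falls into "unknown" or "unclassified" is one where
the service has to fall back on interpretation and data mining.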

We've spent a lot of time and effort on advocacy and policies over
the last couple of years, but I think it's time that we went back to
some of the technical fundamentals and made sure that our information
interoperability is up to scratch, otherwise we'll find ourselves in
a universe where the only thing you can do is a keyword search!

--
Les
(just my opinion)

Received on Fri Oct 27 2006 - 11:18:39 BST
