Enhanced metadata, interoperability and searchability

From: Stevan Harnad <harnad_at_ecs.soton.ac.uk>
Date: Fri, 27 Oct 2006 16:43:30 +0100

> at CCLRC (so-called formalised DC)... it is strongly interlinked with CERIF
> (the data/metadata standard for research information maintained by euroCRIS
> http://www.eurocris.org). Stevan in particular will remember all this from the
> CRIS2006 conference which he attended. http://ct.eurocris.org/CRIS2006/

I not only remember the euroCRIS meeting, but I strongly endorse CERIF
and CRISs! But let there be no misunderstanding: OA's priority today is
*content*, not search, or enhanced interoperability. OAI (or even less) is
enough for now: What's missing and urgent is not enhanced interoperability
but OA content (the absent 85% of it). On no account should either
self-archivers or self-archiving mandaters wait for or weigh themselves down
with enhanced metadata schemes at this time! On the contrary, once
OA content heads reliably and rapidly towards 100%, the enhanced
interoperability will not be far behind!

Stevan Harnad

On Fri, 27 Oct 2006, Jeffery, KG (Keith) wrote:

> All -
>
> I agree with Les that we still need repositories and that Google, even
> if customised, is still a rather blunt instrument.
>
> The problem with the metadata is fairly obvious; it is machine readable
> but not machine understandable i.e. the syntax is rather loose and the
> semantics almost non-existent. This results in the end-user having to
> browse on screen to achieve the required degree of recall and relevance
> - a time-consuming and non-scalable way forward.
>
> If this is compared with the formal metadata of a DBMS schema, or that
> associated with any particular, specialised domain of scientific
> research for data exchange / access then the difference is obvious
> immediately.
>
> We need formalised metadata that ensures (heterogeneous) computer
> software systems can interoperate using it. We have to resolve character
> set, language, syntax and semantics. We've had a go at this at CCLRC
> (so-called formalised DC) and it is strongly interlinked with CERIF (the
> data/metadata standard for research information maintained by euroCRIS
> www.eurocris.org). Stevan in particular will remember all this from the
> CRIS2006 conference which he attended.
> ------------------------------------------------------------------------
> Prof Keith G Jeffery Director Information Technology
> and International Strategy
> kgj_at_rl.ac.uk CCLRC Rutherford Appleton Laboratory
> T:+44 1235 44 6103 Chilton, Didcot, OXON OX11 0QX UK
> F:+44 1235 44 5147
> WWW Person: http://www.bitd.clrc.ac.uk/Person/K.G.Jeffery
> Department: http://www.bitd.clrc.ac.uk
> President ERCIM & CCLRC Director: http://www.ercim.org/
> W3C Office at CLRC-RAL http://www.w3.org/
> President euroCRIS http://www.eurocris.org/
> VLDB Trustee Emeritus: http://www.vldb.org/
> EDBT Board Member http://www.edbt.org/
>
> -----Original Message-----
> From: American Scientist Open Access Forum
> To: AMERICAN-SCIENTIST-OPEN-ACCESS-FORUM_at_LISTSERVER.SIGMAXI.ORG
> Date: Fri, 27 Oct 2006 10:15:52 +0100
> From: Leslie Carr <lac_at_ecs.soton.ac.uk>
> To: JISC-REPOSITORIES_at_JISCMAIL.AC.UK
> Subject: Re: OpenDOAR Search
>
> On 26 Oct 2006, at 19:00, Hubbard Bill wrote:
>
> > Please find below an announcement from OpenDOAR for a search facility
> > based on OpenDOAR holdings.
>
> This is a very interesting service!
>
> There was a discussion on this list at the beginning of August about
> "Search Engines for Repositories Only". There were several attempts to
> define constrained searches using RollYO or similar, but they all
> suffered from one defect or another (too few sites, or logins required
> etc). The Google Custom Search that OpenDOAR have set up seems much more
> suitable to the repository community needs. Further, it would seem to be
> fairly simple to set up Country-specific searches (a la UKOLN's EPrints
> UK) by providing location-identifying annotations for each repository.
>
> I have had a go with this, and created a ROAR-based Repository Search
> Engine at
> http://google.com/coop/cse?cx=009118135948994945300%3Agvogitng0da
> You can search all the ROAR repositories for a keyword and then Derek
> Law can click on 'Scottish Research' to reduce the set of results to
> those coming from the Scottish repositories (the "small and smart"
> ones, according to his recent keynote at Open Scholarship :-)
>
> There is a serious point that this opens up: why would we bother with
> OAI-based repositories, if you can do it all with Google? The advantage
> that OAI provided us was "metadata", i.e. the possibility of providing
> more accurate resource identification. The advantage of repositories
> was that they provided an identifiable source of (well-maintained)
> research material. Of course, the one can be simulated by
> the other, and if Google could support a simple quality control
> "refereed material" tag then we could get by without OAI and without
> repositories.
>
> Well, it doesn't, and so OAI still seems our best hope. However, even
> with five years of OAI our repositories are not doing a very good job of
> sharing metadata that helps a service to comprehend the status of the
> holdings that it harvests (is this a published, refereed journal
> article or equivalent? Is this a paper from an unrefereed workshop?
> Is this a chemical data file?) Too much is still down to interpretation
> and subsequent data mining of the web pages. The Eprints Application
> Profile
> (http://www.ukoln.ac.uk/repositories/digirep/index/Eprints_Application_Profile)
> seems to be doing a good job in achieving consensus in the use of Dublin
> Core, but there is an urgent need for it to be implemented by all
> repositories!
>
> We've spent a lot of time and effort on advocacy and policies over the
> last couple of years, but I think it's time that we went back to some of
> the technical fundamentals and made sure that our information
> interoperability is up to scratch, otherwise we'll find ourselves in a
> universe where the only thing you can do is a keyword search!
> --
> Les
> (just my opinion)
>
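
[Illustrative note on the point above about harvested metadata not conveying
the status of holdings. This is a rough sketch, not anything proposed in the
thread: it assumes a repository exposes simple Dublin Core (oai_dc) records
and marks refereed status with dc:type values "PeerReviewed" and
"NonPeerReviewed". That vocabulary, the sample record and the classify()
helper are all assumptions made for illustration. Where a repository omits
such a value, the harvester can only report "unknown", which is the gap
being described.]

```python
import xml.etree.ElementTree as ET

# Namespace of simple Dublin Core elements as used inside oai_dc records.
DC = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical record: the dc:type values are an assumed vocabulary.
SAMPLE = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An example article</dc:title>
  <dc:type>Article</dc:type>
  <dc:type>PeerReviewed</dc:type>
</oai_dc:dc>"""


def classify(record_xml):
    """Return 'refereed', 'unrefereed' or 'unknown' for one oai_dc record."""
    root = ET.fromstring(record_xml)
    # Collect every dc:type value, normalised to lower case.
    types = {(el.text or "").strip().lower() for el in root.iter(DC + "type")}
    if "peerreviewed" in types:
        return "refereed"
    if "nonpeerreviewed" in types:
        return "unrefereed"
    # No status information supplied: the service has to guess or data-mine.
    return "unknown"


if __name__ == "__main__":
    print(classify(SAMPLE))
```

[A record carrying only a title and a generic dc:type would fall through to
"unknown", leaving the service to interpret the web pages itself.]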
Received on Fri Oct 27 2006 - 16:57:30 BST
