Re: Developing an agenda for institutional e-print archives

From: Philip Hunter <lispjh_at_UKOLN.AC.UK>
Date: Fri, 13 Jul 2001 19:46:49 +0100

John MacColl (SELLIC project, University of Edinburgh) has kindly given his
permission for this edited version of his report on the OAi day to be
circulated to the DNER list. The event will also be covered in the September
edition of Ariadne (issue 29).

'Developing an agenda for institutional e-print archives'
Report of Meeting, Institution of Mechanical Engineers, London, 11 July

The meeting was chaired by Sheila Corrall. The first presentation was given
by Catherine Grout, Assistant Director (Collections) of the DNER, and
essentially outlined the JISC/DNER perspective on the OA initiative, which
is that open archiving provides a technology for cementing the DNER
architecture. The DNER investment over the next few years will be primarily
in middleware, fusion and infrastructure services. Her assumption seemed to
be that content and presentation are already largely catered for (a view
which was challenged later during the moderated discussion at the end of the
day).

JISC services supplying or facilitating access to content are the RDN and
MIMAS, which are seeking to make their metadata OAi-compliant. JISC has also
funded the eprints distribution work at Southampton, and is supporting the
Open Citations project. Tools, guidelines, best practice case studies and
pilot projects are all likely to be the sort of initiatives which JISC will
wish to fund. JISC will be interested also in projects involving communities
other than libraries.

Michael Nelson (NASA) gave an entertaining historical overview of the OAI,
'OAi past, present and future'. Distributed searching, the computing
science 'hammer' to the 'interoperability nail', is hard to do. There were
many attempts in the mid-90s, which failed. The OAI alternative, metadata
harvesting, proposed instead by Van de Sompel (now the e-Director of the
BL), Nelson, Lagoze and others, was also hard to do. Every archive had its
own different format. The repositories which were included at the beginning
included arXiv (physics), Cogprints (cognitive science), NDLTD (theses) and
RePEc (economics).

The OAi idea separates data providers from service providers. Data providers
must provide methods for metadata harvesting. The OAI is only about
metadata, not full text. It is also neutral with respect
to the source of the metadata. The protocol, launched in January/February of
this year, has been frozen for 12-15 months to allow services to be built on
a stable platform. Nelson also explained the difference between OAi and OAIS
(Open Archival Information System), which is a developing standard for
digital preservation. This has confused a lot of people.
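The data provider / service provider split described above can be illustrated with a minimal sketch of how a service provider might form a harvesting request. The verb and parameter names (`ListRecords`, `metadataPrefix`, `set`, `from`) follow the later OAI-PMH 2.0 conventions, which postdate this meeting and are an assumption here; the base URL and the function name are hypothetical.

```python
from urllib.parse import urlencode


def build_harvest_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build a metadata-harvesting request URL for a data provider.

    Assumes OAI-PMH 2.0 style parameter names; base_url is a hypothetical
    endpoint exposed by the data provider.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Restrict the harvest to one partition of the archive.
        params["set"] = set_spec
    if from_date:
        # Only ask for records added or changed since this date.
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_harvest_url("https://archive.example.org/oai",
                            set_spec="physics", from_date="2001-07-01"))
```

Only metadata travels over this interface; a service provider that wants the full text would follow whatever link the harvested record contains.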

The protocol uses XML, which has lots of advantages (e.g. schemas to
determine compliance). But it is unforgiving and a strong disciplinarian,
in that it forces clean metadata. The OAi protocol is always a front-end
for another dataset: it has no interface for
record input or deletion. Eprints, for example, is an archiving system with
the OAi protocol built in. The protocol also supports sets to partition
archives, e.g. by discipline.
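As a rough illustration of why XML is described as unforgiving, the sketch below parses a small harvested response and shows that a malformed record is rejected outright rather than partially accepted. The sample response, its namespaces (taken from the later OAI-PMH 2.0 and Dublin Core conventions) and the function names are assumptions made for this example, not anything shown at the meeting.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as used by OAI-PMH 2.0 and Dublin Core (assumed here).
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical response from a data provider, with one record in one set.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:archive.example.org:1</identifier>
        <setSpec>physics</setSpec>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example eprint</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def extract_records(xml_text):
    """Return identifier, set membership and title for each harvested record."""
    root = ET.fromstring(xml_text)  # raises ET.ParseError on malformed XML
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", default="", namespaces=NS),
            "sets": [s.text for s in rec.iterfind("oai:header/oai:setSpec", NS)],
            "title": rec.findtext(".//dc:title", default="", namespaces=NS),
        })
    return records


def is_well_formed(xml_text):
    """Report whether the text parses as XML at all."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(extract_records(SAMPLE))
    print(is_well_formed("<record><title>broken</record>"))
```

Well-formedness is only the first hurdle; checking a record against a schema, as the talk mentions, would need a validating parser, which the standard library module used here does not provide.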

Stevan Harnad, Professor of Cognitive Science at the University of
Southampton, then gave a paper on 'The potential of institutional eprint
archives'. OAi has widened from its original focus on eprints, and Harnad
wanted to narrow the focus back to the original publication type (i.e. peer
reviewed papers). He now calls this the 'Self-Archiving Initiative'. He was
very much in favour of archiving at the institutional level rather than by
discipline - he argued that the motivation to do this is institutional,
since institutions lose when their own researchers' work cannot be read by
other researchers who are debarred from access by high
subscription costs. Harnad advocates that all research universities mandate
a CV with all published papers linked to an institutional archive. There is
therefore an explicit link there to RAE methodology, which could make the
RAE redundant (the impact would be measured by continuous assessment).
Harnad has been trying to persuade a group of Provosts of elite US
universities to do this. In the UK, the people we need to persuade are the
Funding Councils, in order to change the methodology for research
assessment.

After lunch, Paul Ayris, Director of Library Services at UCL, spoke on 'Why
research libraries need open archives'. The cumulative increase in the RPI
since 1986 is c. 50%; that in periodical prices is nearly 300%, while
library funding in real terms has dropped by about 1% over the
same period. The NESLI deals which have been brokered have proved difficult
for CURL, since they have been based on traditional spend on print journals.
This is effectively a tax on research. CURL wants to lobby for a general
review of STM publishing by the Director of Fair Trading. CURL will produce
advocacy packs for its member institutions for the next academic session, to
alert Principals and Vice-Chancellors to the problems.

As Chair of the relevant CURL Task Force, Paul advocated the establishment
of OAi servers in institutions, though consortial or regional models may
also be appropriate. Libraries should lead this. He asked about the costs of
OAi, in terms of staffing, metadata and infrastructure. There is also the
need to clarify the ownership of IPR. In his suggested action plan he noted
that Glasgow, Nottingham, Edinburgh, Southampton and Strathclyde are all
setting up archives, and asked whether JISC could fund an evaluation of
these, and whether JISC funding could also be provided to support the
establishment of archives in all institutions.
Charles Oppenheim mentioned in the Q&A that JISC is setting up an IPR
committee under Brian Fender, and including Charles. This will address many
of the issues which Paul had raised.

The next paper was given by Chris Rusbridge and William Nixon of the
University of Glasgow: 'Setting up an institutional eprints archive: what is
involved?' The Glasgow model is inclusive of all types of scholarly
publication, including reports, conference papers, monographs and book
chapters. They had hoped to invoke their archive in the current RAE, but
could not get things established in time. They were explicit about long-term
digital preservation not being part of the aim.

The Glasgow server has only 15 papers at present. Chris questioned some of
the formats supported by the eprints software (Word and HTML, for example).
Glasgow has added PDF and is planning for XML. Being able to link in to the
authentication structures of the institution (as in single sign-on) would be
a good thing. Links to Reference Manager should be supported, and a better
audit trail is required, such as a record of submission date. He also asked
whether any quality checking should happen.

Chris Rusbridge spoke of his wish to set up an e-theses service at Glasgow,
possibly using NDLTD (Networked Digital Library of Theses and
Dissertations). City University is so far the only UK university in
membership of NDLTD. He said that NDLTD is likely to increase use of theses
by 400%, according to Virginia Tech figures.

The final paper was by Rachel Heery: 'European support for Open Archives'.
She introduced a new European project, the Open Archives Forum, an
Accompanying Measure funded by the EC IST programme, whose partners include
Humboldt University and IEI-CNR in Pisa. Part of the EC motivation in
funding this is to release the value of the invisible web (materials at a
deep level not often picked up by search engines), to act as a focus for
dissemination and the collaborative development of software, and to help
build a community of interest. This work is also exploring some of the relevant
business models. It will also evaluate the OAi protocol technologies,
comparing the protocol with HARVEST and Z39.50, for example, and asking the
question whether DC is sufficiently rich as a metadata format.

The day then concluded with an open discussion session, moderated by Jan
Wilkinson, University Librarian at the University of Leeds. Gordon Dunsire
made a plea for the initiative to cover all scholarly material. That point,
however, had already been granted. There was a suggestion that an XML
cleanser be provided to the community (to clean up bad XML in metadata
records). Ronald Milne presented the case for disciplinary rather than
institutional archives, echoing a point made earlier by Peter Brophy. This
did not receive too much support, though the point was made that most papers
these days are written by authors from several institutions. The JISC view
on the day seemed to be that there
wasn't a contradiction between the two approaches (the metadata could be
produced on an institutional level, and another party might produce a
discipline-based view).

Charles Oppenheim made the point that institutional frameworks can help
junior researchers. John MacColl suggested that JISC should fund a pilot
study in a small group of institutions to assess research impact, by
requiring that researcher CVs are deposited online with links to papers in a
local open archive, as Stevan Harnad had suggested in the morning session.
This should also assist in the filling of archives. There was some support
for this notion of a pilot initiative funded by JISC, from Paul Ayris. This
connected with Catherine Grout's reminder that there is a research contract
obliging this in the case of some research councils. Fred Friend also
supported the notion of tackling the Funding Councils on this. Chris
Rusbridge suggested that JISC may make such deposit a condition of any grant
it awards. Thomas Krichel suggested the creation of a disciplinary archive
in library and information science.

Sheila Corrall concluded the meeting with a summing-up: funding from JISC to
allow us to build on existing projects and to experiment; a side by side
institutional/disciplinary approach, since the two are not mutually
exclusive. She also suggested that JISC invite bids for imaginative
suggestions for populating archives, of either type. They should put the
funding up and invite us to bid imaginatively for it.

Philip Hunter, Information Officer at UKOLN, and Editor of Ariadne
UKOLN, c/o Library, University of Bath, Bath, BA2 7AY
Tel: +44 (0) 1225 826 354 Fax: +44 (0) 1225 826838
Received on Wed Jan 03 2001 - 19:17:43 GMT
