     Here is a reply by Maurits van der Graaf to some questions
     from Prof. T.D. Wilson and myself about whether the DRIVER study
     over-estimates current deposit rates. The main result seems to
     be that about 15% of repository content (average 9000 items) is
     full-text journal articles. What percentage this represents of
     annual institutional output is not determined, but when asked to
     estimate it, authors' estimates averaged 37%. [There is a
     need to check on what the authors based their estimates.] -- SH

Professor Wilson raises in his comments on the DRIVER Inventory Study a
very interesting topic: how successful are the (mostly institutional)
repositories in covering the research output of their institutes?

In our study we asked the managers of the repositories for research output
throughout the European Union to provide data on their contents, such as the
types of material covered and the total numbers of records. Using a number of sources,
we identified approximately 230 institutes with a possible repository for
research output in the European Union and approached them to participate in
this inventory study. In all, 114 repositories from 17 countries

Based on figures provided by 104 repositories, it appears
that on average digital repositories contained nearly 9000 records (8984,
as assessed in the second half of 2006). The large majority of these
records (90%) relate to textual materials: these records can be split into
metadata-only records (61%) and full-text records (29%).

(5% of the records relate to non-textual materials such as images, video,
music and primary datasets. The 5% 'other materials' relate to learning
materials, student papers, etc.)

What types of textual materials [of the 29% of the average 9000 records
per repository] are deposited? More than half of the textual materials
relate to journal articles (54%); a smaller share relates to books or book
chapters (19%). Theses, proceedings and working papers - often labelled
as grey literature - have a share of 29%.

[I.e., about 15% of the average 9000 are journal articles]
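
The bracketed figure can be checked by multiplying the shares reported above. This is only a minimal sketch: it assumes, as the bracketed notes do, that the 54% journal-article share applies to the 29% of records that are full text. The variable names are illustrative and do not come from the DRIVER study.

```python
# Figures quoted in the reply above; variable names are illustrative only.
avg_records = 8984        # average number of records per repository
full_text_share = 0.29    # share of records that are full-text records
journal_share = 0.54      # share of textual materials that are journal articles

# Assumption: the journal-article share applies to the full-text records.
journal_full_text_share = full_text_share * journal_share
journal_full_text_items = avg_records * journal_full_text_share

print(f"{journal_full_text_share:.1%} of all records")
print(f"about {journal_full_text_items:.0f} items per repository")
```

Under that assumption the product comes to just under 16% of records, which is in line with the "about 15%" quoted.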

In another question we asked the respondents of this survey to estimate the
percentage of their institute's 2005 research output that had been deposited in
their repository. The average percentage estimated was 37%.

How do these figures relate to other studies? A recent study by the
Association of Research Libraries surveyed 87 research institutes with 31
operational institutional repositories. It found that a typical
institutional repository holds about 3800 digital objects (SPEC KIT 292,
Institutional Repositories, July 2006, ARL).

A much broader survey (2147 libraries in the USA contacted, 446
participants) identifies 48 operational repositories. Of those, 50%
contain fewer than 1000 digital documents, and nearly 20% contain more than
5000 items.

Although lower than our figures, these American surveys suggest that
research repositories contain thousands of items, instead of the hundreds
of items found by the study of Professor Wilson among 22 UK research

The discrepancy between our numbers and the numbers of Professor Wilson
could be caused by:

     A different selection of institutional repositories:
     our selection (although the largest study on operational
     repositories so far) might be biased to more active and more
     successful repositories.

     Timing of the survey:
     Professor Wilson includes numbers up to 2004; since then, many
     research institutes have accelerated their activities with regard
     to repositories for research output.

     Different records identified:
     we also included metadata-only records in the numbers; the inclusion
     criteria of the other surveys are not explicitly stated.

The DRIVER project aims to put a test-bed in place across Europe to assist
the development of a knowledge infrastructure, based on repositories for
research output. For that purpose, our survey aimed to make an inventory of
the current state of repositories in the European Union. Based on its
results, we believe that the situation with regard to covering research
output by institutional repositories is better than suggested by
Professor Wilson. But even with this more positive outlook, coverage of
research output remains a crucial element in the further development of
repositories and the proposed knowledge infrastructure.
Please see for the entire study and options to comment on the study

> Forwarded from BOAI Forum: Important corrections from Professor Wilson
> regarding the true rate and proportion of spontaneous self-archiving
> of article full-texts. This rate and proportion is almost certainly
> over-estimated by the Driver Study -- but that only reinforces its
> recommendation that self-archiving needs to be mandated. -- SH
>
> One of the items in Peter Suber's OA News made me raise my eyebrows:
> he quotes from the DRIVER report that:
> "On average, the estimated percentage of research output of 2005 deposited in
> the digital repositories is 37%"
>
> This, of course, is a Europe-wide study and perhaps the success of repositories
> varies considerably from country to country. In the study I did last year of
> the UK repositories, I estimated that they contained something in the order of
> 3% of the research output from the UK in 2004 - I would be very surprised if
> they had improved tenfold in one year.
>
> Looking further into the report I see that only 57 UK institutions were invited
> to respond to the investigation and, of these, only 51% responded (and we can
> assume that those who respond are most interested in the subject under
> investigation). The report also notes that:
> "On average a digital repository contains in total 8,984 items."
>
> Meaning, of course, items of all kinds, not solely journal papers. Again, this
> figure contrasts starkly with the situation I found in the UK, where the
> combined total of journal papers in ALL of 21 repositories available for study
> was 9,739 - an average (meaningless, of course, as any average of a skewed
> distribution) of 464 items per repository. In fact the totals by institution
> ranged from 2 items in total to 5,139 items, with a median value of 78 items.
> A comparison of these data with those from the DRIVER report still leaves me
> puzzled :-)
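
The arithmetic in the quoted message can be reproduced from its own figures. This is a minimal sketch only: the per-repository counts are not given, so it uses just the totals, the median, and the two percentage estimates stated in the message. The variable names are illustrative.

```python
# Figures quoted in Professor Wilson's message; variable names are illustrative.
total_journal_papers = 9739   # combined total across the UK repositories studied
repository_count = 21
median_items = 78             # median reported in the message

mean_items = total_journal_papers / repository_count
mean_to_median = mean_items / median_items

# Ratio between the DRIVER estimate (37%) and the UK estimate (about 3%).
estimate_ratio = 37 / 3

print(f"mean items per repository: {mean_items:.0f}")
print(f"mean is {mean_to_median:.1f} times the median")
print(f"DRIVER estimate is {estimate_ratio:.1f} times the UK estimate")
```

The mean being several times the median illustrates the remark that an average of such a skewed distribution says little about a typical repository.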
