Re: ARL Publishes SPEC Kit 292: Institutional Repositories

From: Charles W. Bailey, Jr. <cbailey_at_UH.EDU>
Date: Wed, 23 Aug 2006 18:37:33 -0500


Thanks for your comments.

The "Survey Questions and Responses" section
of the document, which presents the detailed
findings, goes from page 23 to page 89 of
the document. Clearly, there is a significant
amount of data. The typical length of a SPEC Kit
Executive Summary is around 1,500 words. The authors
strongly advocated a longer summary, and the compromise
length was around 5,100 words. In an attempt to make
more data freely available in the Executive Summary, many
drafts were crammed with more data, but the result was
very difficult to read. The final document attempted to
accurately report the key findings given space and
time limits; however, it was written to be an introductory
part of a larger document, not as a stand-alone, highly
analytic research report.

Access to the SPEC Kit contents is determined by
ARL's publishing policies for this type
of document. As noted in the initial message, ARL
staff invested quite a bit of effort in producing
the document; it was not simply delivered to them
ready to be edited, typeset, and published as was the
Open Access Bibliography, which was published by
ARL under a Creative Commons License and made OA.

Regarding the idea that IR start-up costs are
in the $10,000 range, 4 out of 15 respondents
with operational IRs reported costs below $25,000;
5 reported costs between $25,000 and $49,999;
2 reported costs between $50,000 and $74,999;
1 reported costs between $75,000 and $99,999;
and 3 reported costs greater than or equal to

Regarding staffing figures, question 8 was:

"Please describe the various units (up to four) in your
institution that have/will have responsibility for the
ongoing operation of the IR. These units may be within the
library, the institution's IT unit, or some other unit. It
is understood that many units in the library (or elsewhere)
will play a part in overall IR support; however, the intent
of this question is to identify the major players. Indicate
name of the unit, the unit's IR responsibilities, the title
of the unit manager, the title of the person to whom the
unit reports, the number of individuals in each staff
category, and total FTE in each staff category. Please
provide any comments that help explain the responsibilities
for the ongoing management of the IR."

For each unit, a table shows the name of the unit, its
IR responsibility, the manager's title, and who the unit
reports to. The data was then analyzed by unit/type of staff,
broken down by number of staff and FTE. These tables show minimum,
maximum, mean, median, and standard deviation values. There are
also extensive comments (to assist in data interpretation,
we decided to publish all comments throughout the document).

The "number of staff" figures were higher than the FTE figures,
which appeared to be the better measure to focus on in the summary.
The FTE figures allowed for the fact that an individual may not
be spending all of his/her time on the IR or that the individual
may be part-time.
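
To illustrate the difference between the two measures, here is a
minimal sketch in Python. The unit, the roles, and the time fractions
are made-up values for illustration only; they are not taken from the
survey data, and the names (StaffMember, headcount, total_fte) are
hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StaffMember:
    role: str
    # Hypothetical share of a full-time position spent on the IR
    # (1.0 = one full-time person working only on the IR).
    fraction_on_ir: float


def headcount(staff):
    """Number of individuals involved, regardless of time spent."""
    return len(staff)


def total_fte(staff):
    """Sum of the fractions of full-time effort devoted to the IR."""
    return sum(member.fraction_on_ir for member in staff)


# Illustrative unit: four people, none of them full-time on the IR.
unit = [
    StaffMember("librarian", 0.5),
    StaffMember("programmer", 0.25),
    StaffMember("support staff", 0.25),
    StaffMember("student assistant", 0.5),
]

print(headcount(unit))  # 4
print(total_fte(unit))  # 1.5
```

In this made-up unit, four individuals add up to 1.5 FTE, which is why
a "number of staff" figure can be well above the corresponding FTE
figure.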

Once you get past the first unit, a wide range of library units
become involved in IR support, not just core technical units.

This may explain why respondents with operational IRs
identify staffing and benefits as the top start-up
and operational costs.

One section of question 9 asked:

"Please estimate the percentage of the budget allocated to each
of the following categories."

For respondents that had an operational IR, the mean results
for start-up costs were:

Staffing and benefits: 63.3%
Hardware acquisition: 25.6%
Software acquisition: 23.0%
Hardware maintenance: 9.2%
Software maintenance: 6.0%
Vendor fees (if IR is hosted
by an external vendor): 70.2%

For respondents that had an operational IR, the mean results
for ongoing operation costs were:

Staffing and benefits: 68.3%
Hardware acquisition: 23.3%
Software acquisition: 14.5%
Hardware maintenance: 10.3%
Software maintenance: 11.5%
Vendor fees (if IR is hosted
by an external vendor): 73.8%

The staffing requirements issue is one that certainly
deserves further study.

As to whether the main purpose of most of these projects was OA,
question three asked "What motivated your institution to establish an IR?
Check all that apply."

The top four reasons for respondents that had operational IRs were:

Increase global visibility of institution's scholarship: 97%
Preserve institution's scholarship: 95%
Provide free access to institution's scholarship: 89%
Collect and organize institution's scholarship in a single system: 89%

The top five types of digital objects that respondents with operational
IRs identified as being in their IRs were:

Electronic theses and dissertations: 67%
Articles, preprints: 61%
Articles, postprints (author modifies preprint to match
published work): 61%
Conference presentations: 50%
Technical reports: 50%

As to whether "the 22% of ARL libraries not planning an IR are
dismayed by misleading cost data," the answer is clearly "no" because,
prior to this survey, there was no such ARL IR cost data, misleading or
not, to be dismayed about.

When thinking about this survey, it is important to remember that
it was limited to ARL libraries.

What is ARL?

"ARL is a nonprofit organization of 123 research libraries
at comprehensive, research-extensive institutions in the US
and Canada that share similar research missions,
aspirations, and achievements. It is an important and
distinctive association because of its membership and the
nature of the institutions represented. ARL member libraries
make up a large portion of the academic and research library
marketplace, spending more than one billion dollars every
year on library materials."

What libraries are in ARL?

Our survey presents data reported by these research libraries about
how they have established and operate IRs (or how they plan
to do so). In other types of institutions, other libraries
(or other institutional units) may make quite different decisions
or plans, and a survey of their efforts could yield quite
different results.


Date: Mon, 21 Aug 2006 20:36:16 -0400
Reply-To: dgoodman_at_Princeton.EDU
Sender: American Scientist Open Access Forum
From: David Goodman <dgoodman_at_PRINCETON.EDU>
Subject: Re: ARL Publishes SPEC Kit 292: Institutional Repositories
In-Reply-To: <>
Content-Type: text/plain; charset=us-ascii

My own detailed reading of the data in the summary supports
Stevan's comments, and indicates that planners of IRs should
consider the summary data with care:

quoting from p. 16, "For start-up, 67% of budgets fall below
$75,000, 14% are $75,000 to $125,000, and 19% are $150,000
or greater. The maximum start-up budget ($1,800,000) is far
greater than the next highest ($400,000) and is from an
institution that included extensive software development and
testing costs in its start-up budget. For ongoing budgets,
there is a similar concentration at the ends of the ranges:
50% are below $50,000 and 50% are $100,000 or greater. The
maximum ongoing budget ($500,000) is also much greater than
the next highest ($300,000) and is reported by an
institution that has a major role in a state-wide IR

They also give the median, which for those that have implemented
the repository is $45,000 start up (with a minimum value of
$8,000) and $42,000 operating costs (with a minimum value of

It is of course possible to spend a great deal more--in the
million dollar range--if one counts the costs of developing
a large-scale-wide network, or extensive new software.
Developing such a system is outside the category of
Institutional Repositories, and should not have been

If I were giving a quick summary, I would emphasize the low
end, especially for the start up costs: It can be done for
less than $10,000, even at a large research university. (And
all the ARL libraries are by definition at large research

I wonder at the staffing figures. Quoting from p.21 "The
typical IR was supported by about 28 FTE" This is impossibly
high, even for the single most expensive project. I think
what was intended might have been: "The total number of the
staff working for at least a small part of their time on the
IR project were from 28 FTE positions," and this might have
perhaps 30 individuals. I do not see the purpose of
presenting these numbers as FTE--one only needs to figure
out the total effort.

The main purpose of most of these projects was not OA. It
was ETD (Electronic Theses and Dissertations.) (p.18) For
three-quarters of the libraries this was the major type of
deposit. This is of course very valuable also, and many of
the key developers of ETD archives are also involved in IRs
for OA. There may also be reasons for starting with ETDs--it
is trivial to require their authors to deposit. They are
eager to get their degree, and are in no position to refuse
or delay.

I will be glad to revise this in the light of the full
report, if the authors will send it to me. If one publishes
with only the summary OA, one will be judged by that
summary. This places a special responsibility on the authors
of a report such as this to ensure that the key values
properly represent the data, because such numbers alone will
be what is quoted.

Perhaps the 22% of ARL libraries not planning an IR are
dismayed by misleading cost data.

Dr. David Goodman, formerly Bibliographer and Research
Librarian, Princeton University Library

Best Regards,
Charles W. Bailey, Jr., Assistant Dean for Digital Library
Planning and Development, University of Houston Libraries
(Provides access to DigitalKoans, Open Access Bibliography,
Open Access Webliography, Scholarly Electronic Publishing
Bibliography, Scholarly Electronic Publishing Weblog,
and other publications.)
Received on Thu Aug 24 2006 - 02:50:38 BST
