Re: Against Conflating OA Self-Archiving With Preservation-Archiving

From: Steve Hitchcock
Date: Thu, 16 Aug 2007 17:35:02 +0100

Peter Hirtle takes issue with the terminology, but his main point
seems to be that repositories are not doing enough to be trusted with
preservation. Why single out repositories? The field of preservation
is experiencing a step change in the switch to digital content, and
new and best practices are still being developed in all areas.

What we can say about institutional repositories is that careful
management and organisation enhance the prospects for preserving
content, and this is much better than random Web sites still used by
many authors. In terms of more formal preservation approaches, there
are now quite a number of projects investigating preservation
services for repositories - I work on one, but you will find most of
them mentioned in this paper. What these
projects are trying to do is apply emerging best practices in digital
preservation to IRs.

If to some extent preservation best practice can be left to services,
the primary issues facing IRs in terms of working with these services
are (1) content, (2) financial resources. The two are linked, of
course, by policy. The best resourced contents wrt preservation tend
to be those that can argue the case for 'national heritage', such as
public records, the contents of national libraries, etc. That doesn't
apply yet to IRs, and in the absence of this it is known that
preservation can be a hard sell.

This brings us to content. Resources for preservation will only
follow where there is sufficient content, and this will require
institutional commitment to the repository and presupposes formal,
institutional policy to underpin the repository. Preservation will
follow content, and content follows effective policy. In turn,
content and policy provide the basis of preservation planning. As
Stevan consistently points out, there aren't yet enough of either. At
the moment, the business case will agree, not least because in the
case of IRs the institutional decision-maker on policy and resources
is likely to be one and the same.

At the same time we don't have to wait for the whole OA corpus to be
complete before we consider what to do. If, as looks likely,
preservation becomes a repository-by-repository responsibility, there
are some repositories now with sufficient content for us to be able
to understand and anticipate the likely needs, and to build and test
services. Our aim is to see those services are ready when needed. To
get there we need a two-way dialogue, between repositories and
services, to match the technical needs with the repository and
business requirements.

Peter's misgivings about repository preservation are premature. Best
practices for digital preservation, when they are more widely
applied, will find their way into repository preservation services.
The business case has to come into line before best practices can be
offered as viable services, but it is not too early for repositories
to plan for preservation, and they will be making a start by
developing an institutionally-backed repository policy.

Steve Hitchcock
Preserv Project Manager
IAM Group, School of Electronics and Computer Science
University of Southampton, SO17 1BJ, UK
Tel: +44 (0)23 8059 7698 Fax: +44 (0)23 8059 2865

At 23:22 15/08/2007, Stevan Harnad wrote:
> Prior Amsci Topic Thread:
> "Against Conflating OA Self-Archiving With Preservation-Archiving"
> On Tue, 14 Aug 2007, Peter Hirtle wrote:
> > When I speak of "archives"... it is as an archivist.
> > For my community, a term like "self-archiving" is an oxymoron -
> I think we need not be that rigid with the word "archiving." It really
> just means storing. The relevant thing for OA is that self-archiving
> provides free online access, not that the deposit is being preserved.
> (It *is* being preserved, too, but that is not the point: We are talking
> about authors' final accepted drafts, not the publisher's PDF or the paper
> edition: the latter is the one that preservationists should be preoccupied
> with. The self-archived version is a supplement, not a substitute.)
> > Self-archiving and open access are fine for providing immediate
> > access to one's work. I have used both.
> That's it. And hence the discussion should really end there, insofar as
> OA is concerned...
> > But no self-archive or
> > open access system (or institutional repository, for that matter)
> > yet meets the standards established for an Open Archival
> > Information System-compliant (yet another "archive"), Trusted
> > Digital Repository.
> So what? OA is about the Access Problem, not the Preservation Problem.
> > What is worse, as I argued in a paper in the
> > April 15th issue of RLG DigiNews, most of the publishers that
> > allow one to deposit post-prints in an institutional repository
> > do not grant authors the rights to give to the repositories the
> > permissions they need in order to be able to preserve the
> > deposited articles over time.
> So what? That will take care of itself, with time. What won't, is OA
> itself. So let OA stay focused on providing OA, not veer off into the
> irrelevance of preservation archiving.
> (To ward off the inevitable torrent: Yes, of course OA content is being
> preserved too -- otherwise the (little) stuff that authors had the good
> sense to self-archive 20 years ago would not still be with us, and still
> OA, today. And of course IRs can and will take care of preserving their
> content. What they need, urgently, is that content, which authors are
> not yet providing. Not publishers' permission to preserve, which is an
> utter red herring.)
> > The only way one can ensure that
> > one's deposited information might be available over time is to
> > use one of the author's addenda (or re-write the publisher
> > contract).
> The best way to ensure that it is accessible, and usable, today, is to
> self-archive it. Worry about preservation once the content's up there
> (and if/when it's the only version afloat). Not now.
> > So there is an immense difference in terms. Self-archiving, open
> > access, and institutional repositories denote computer systems
> > that facilitate near-immediate access to writings. Trusted
> > Digital Repositories (aka "archives") are established, funded,
> > and have the necessary legal, technical, and administrative
> > capabilities to maintain digital information over time in either
> > a closed or open system.
> Yes; and let us focus on author-version self-archiving and IRs for OA --
> and publisher version archiving and TDRs for preservation.
> > The problem with the language is that
> > the use of the term "archive" in "self-archiving" implies to many
> > that the TDR requirements are being met - when instead, in
> > reality, access is guaranteed only as long as the "self-archives"
> > does not have to make a copy of the original work.
> Actually, neither OA self-archiving nor preservation archiving means
> much to much of anybody, since so little of either is actually being done
> today. But it seems to me that we can see and understand the difference in
> the target content and the agenda, once it's pointed out, without having
> to submit the locution "archiving" to any Solomonian slicing. It's just
> normal polysemy...
> > If one wants
> > an article to be permanently available, one has to secure the
> > necessary right to do so from the publisher and find an IR that is
> > committed to becoming a TDR - or rely upon the publisher to take
> > advantage of initiatives such as PORTICO and LOCKSS to ensure
> > that access (open or otherwise) will exist over time.
> Indeed. And let those who are fussed about that, devote their efforts to
> making sure that the official versions of all 2.5 million annual published
> articles in all 25,000 peer-reviewed journals are permanently available
> by devoting themselves to TDR archiving.
> And let those who are fussed about the needless daily, weekly, monthly,
> yearly loss of research usage and impact from which research is currently
> (anosognosically) suffering, devote their efforts to making sure that
> the author's versions of all 2.5 million annual published articles in
> all 25,000 peer-reviewed journals are at long last self-archived (sic)
> in their authors' institutions' IRs.
> Amen,
> Stevan Harnad
> > Peter Hirtle
> >
> > On 8/12/07, Stevan Harnad wrote:
> > >
> > > On Mon, 6 Aug 2007, Peter Hirtle wrote:
> > >
> > > > I for one am in agreement 100% with Sandy Thatcher on this. We
> > > > already are suffering confusion because of the ill-advised
> > > > decision to use terms like "self-archiving" and "open
> > > > archive," both of which have nothing to do with archives or
> > > > the permanent retention of knowledge.
> > >
> > > Both terms were perfectly fine for providing online access
> > > (permanently, of course).
> > >
> > > But "open archive" then went on to denote OAI-compliant and
> > > interoperable, but not necessarily Open Access, so "Open
> > > Access" was needed as an extra descriptor. "Repository" was
> > > (and is) of course entirely superfluous ("archive" would have
> > > done just fine), but now "Institutional Repository" has
> > > consolidated its supererogatory niche, so OA IR is what we have
> > > to make do with.
Received on Fri Aug 17 2007 - 02:48:32 BST