Re: preservation vs. Preservation (fwd)

From: Stevan Harnad <harnad_at_ecs.soton.ac.uk>
Date: Tue, 7 Mar 2006 13:55:24 +0000

I am forwarding Les Carr's wise and witty reply to Charles Oppenheim.
It was posted to JISC-REPOSITORIES but the thread is also playing out
in part on the AmSci Forum. Les is the head of the JISC PRESERV
Project http://preserv.eprints.org/ -- SH
[Ceterum censeo, in the special case of OA self-archiving we are talking
about *supplementary author drafts*, not the original, published articles
themselves (whose Preservation Problem is in the hands of their Publishers
and Subscribers, not their authors and their institutions), and, more
important, we are talking about *why* authors should/would bother to
self-archive such OA supplements at all.]

---------- Forwarded message ----------
Date: Tue, 7 Mar 2006 09:18:03 +0000
From: Leslie Carr <lac_at_ecs.soton.ac.uk>
To: JISC-REPOSITORIES_at_JISCMAIL.AC.UK
Subject: Re: preservation vs. Preservation

On 6 Mar 2006, at 20:02, Charles Oppenheim wrote:

> Stevan, I self-archive to ensure my articles are widely read now
> AND WILL BE FOR THE FORESEEABLE FUTURE. The former without the latter is just
> nonsensical.

    [[ Executive Summary: Charles' requirements seem already to be
    satisfied by his self-archiving and by some common repository
    management practice. But that only applies to the articles that he
    has self-archived. ]]

The truly foreseeable future is a very short timeframe indeed, but it
is interesting that you chose that phrase over "in perpetuity" or
"for the long term" which seem to be used in this context as a vague
indicator of some unspecified future time more distant than the
collapse of Western civilisation (by analogy with the Library of
Alexandria). It would be very interesting to see some thoughtful
descriptions of reasonable timeframes for accessibility and hence the
preservation processes that are likely to come into play.

A contributor quoted below was worried that material in
a repository might become inaccessible after 24 hours or 3 years. I
would guess that the former timescale would most likely be due
to service instability (our data centre just burned down) and the
latter to institutional instability (we've changed our minds about a
repository). It is also possible that 3-year inaccessibility could be
the result of 'format inaccessibility' if the format was made up by
the researcher (or project) responsible for a piece of data (I can't
remember how to interpret the contents of this file OR the person who
created the file has now left the institution).

On reasonable timescales (i.e. greater than 3 years) I would have to
cite arxiv.org in evidence. The earliest papers there (now 16 years
old) are still accessible (see
http://www.arxiv.org/list/astro-ph/9204 ).

Other timeframes for consideration might be as follows:

  a) the download lifespan of your article which elapses once your
  document hasn't been read for a whole year (or decade).

  b) the citation lifespan of your article which elapses once your
  document hasn't been cited for a whole year (or decade).

  c) the relevance lifespan of your article (perhaps as defined by you,
  the author)

  d) your career lifespan which elapses once you retire

  e) your lifespan (I needn't elaborate)

  f) your institution's lifespan (but surely institutions are created,
  not destroyed?)

  g) an arbitrary long-sounding period e.g. 100 years

  h) a statistically-defined period based on some observable feature
  of the literature or of its use

  i) the economic lifespan of your article, which elapses when it
  becomes too expensive to provide access to it

  j) some combination of the above.

The JISC PRESERV project http://preserv.eprints.org/ has been undertaking
some work on defining preservation criteria in terms of citation
lifespans: it is interesting to note that the earliest arxiv paper in the
above list (Gamma-Ray Bursts as the Death Throes of Massive Binary Stars,
Astrophys.J. 395 (1992) L83-L86) is still receiving approx 5 citations
a year and around 10 downloads a year from the UK arxiv mirror.
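
As a rough illustration of timeframes (a) and (b) above, the sketch
below computes a lifespan from a list of yearly download or citation
counts. The function name, the idle_years parameter and the sample
counts are all assumptions made up for this example; they are not
PRESERV's actual criteria or data.

```python
from typing import Sequence


def lifespan_years(yearly_counts: Sequence[int], idle_years: int = 1) -> int:
    """Return how many years an article stayed 'alive'.

    The lifespan is taken to elapse at the start of the first run of
    `idle_years` consecutive years with a zero count (no downloads or
    no citations). If no such run occurs, the article is still alive
    and the full length of the record is returned.
    """
    if idle_years < 1:
        raise ValueError("idle_years must be at least 1")

    idle_run = 0  # length of the current run of zero-count years
    for year_index, count in enumerate(yearly_counts):
        if count == 0:
            idle_run += 1
            if idle_run == idle_years:
                # The lifespan ended where this idle run began.
                return year_index - idle_years + 1
        else:
            idle_run = 0
    return len(yearly_counts)


# Hypothetical yearly citation counts, invented for illustration only.
sample_counts = [5, 3, 0, 2, 0, 0]
print(lifespan_years(sample_counts, idle_years=1))
print(lifespan_years(sample_counts, idle_years=2))
```

Choosing a larger idle_years (a decade rather than a year, say) makes
the definition more forgiving of articles that are rediscovered after
a quiet spell.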

By self-archiving your papers I'm sure that we would all judge that you
have made the first step towards satisfying your goal of "ensur[ing] my
articles are widely read now AND WILL BE FOR THE FORESEEABLE FUTURE". By
self-archiving them in a reasonable format (you seem to have chosen
PDF for all your deposits) you have opted for a format which is widely
accessible, well supported, publicly documented and has many different
renderer implementations, both commercial and open source. This seems
to be an excellent basis for accessibility and near-to-mid-term
preservation, and it is reflected in the decision of many
preservation-oriented repositories, including MIT's, to support PDF.
Combined with some simple and fairly low-impact technical support from
the repository or its service providers, we might reasonably expect
access to your articles to be preserved into the longer term, whatever
that turns out to be.

So I don't see any problem. You seem to have taken all the required
steps, and as long as Loughborough's repository is managed as
responsibly as we would expect then your objectives would already
seem to be satisfied. In other words, best OA practice (as seen in
many OA repositories) naturally facilitates preservation (with any
capitalisation that you choose).

Actually, there is a bit of a problem, and I hope you'll forgive me
for pointing this out. The repository only contains one of your
papers from 2005, so the remainder from last year don't yet have the
same guarantee of preservation that OA practice provides :-)

Les Carr

PS Your institutional repository ( http://magpie.lboro.ac.uk/dspace/ )
doesn't yet make any preservation commitments for specific formats,
including PDF. See http://magpie.lboro.ac.uk/dspace/help/formats.jsp .
But I'm sure that these will appear in time.

> Quoting Stevan Harnad <harnad_at_ECS.SOTON.AC.UK>:
>
>>
>>> it... seems to make little sense to go to the effort of making
>>> information accessible NOW when it could theoretically be
>>> inaccessible 24 hours from NOW or even 3 years from NOW...
>>
>> Please refer to Steve Hitchcock's posting about PRESERV.
>>
>> http://listserver.sigmaxi.org/sc/wa.exe?A2=ind06&L=american-scientist-open-access-forum&D=1&O=D&F=l&P=14808
>>
>> As I said from the outset, Eprints and OA are of course (quite
>> naturally and without fanfare) attending to small-p preservation (as
>> has Arxiv, since its inception in 1991, and CogPrints since its
>> inception in 1997 -- note that all their contents are still here,
>> with us, in 2006, in continuous use, again without any fanfare about
>> large-P Preservation). But Preservation is not why they were
>> self-archived!
>>
>> The point is simple: Preservation is *not* the reason researchers
>> self-archive their postprints, which are final, refereed drafts of
>> their published articles. Maximising their accessibility and their
>> impact is the reason researchers self-archive their postprints. It
>> is not those self-archived supplements that require the large-P
>> Preservation, it is the published originals.
>>
>> If researchers self-archive at all, they do not do it in order to
>> Preserve their articles; they do it in order to increase their
>> articles' usage and impact. And only 15% of researchers as yet
>> self-archive. The goal of OA is to raise that 15% to 100%. Neither
>> the silly suggestion that authors should self-archive in order to
>> Preserve their articles -- nor any extra work or complications anyone
>> foolishly adds to the self-archiving procedure (such as it is, for
>> example, in Eprints IRs today) in the interests of Preservation --
>> will do anything to help raise that 15% to 100%: On the contrary, a
>> bad reason for self-archiving and needless extra work in
>> self-archiving will only deter self-archiving. And neglect of OA for
>> other archiving priorities (e.g., Digital Preservation) is the worst.
>>
>> At the same time, articles in OA IRs *are* being small-p preserved,
>> as noted. So that's not a substantive issue either.
>>
>> The only substantive issue is how to fill OA IRs with 100% of
>> institutional OA article output, as soon as possible. (It's already
>> vastly overdue and substantial research impact and progress continue
>> to be needlessly lost till it happens.)
>>
>> I have listed many heroic librarians who understand this fully, and
>> have been at the forefront of OA efforts and success (e.g., Paula
>> Callan, Helene Bosc, Eloy Rodrigues, Derek Law, Susanna Mornati,
>> and many, many others). But there are also many in the library
>> community who are ignorant of or indifferent to OA, and have other
>> ideas about what to do with IRs. Several are discussed in Richard
>> Poynder's insightful analysis. And it is a parting of ways with them
>> that Richard was proposing to the OA movement (and he may well be
>> right).
>>
>> http://poynder.blogspot.com/2006/03/institutional-repositories-and-little.html
>>
>> Stevan Harnad
>>
>>
>>> Hi John,
>>>
>>>> All this has nothing to do with making
>>>> information accessible NOW. You have failed to distinguish between
>>>> present and future accessibility.
>>>
>>> The point I was making is that the differentiation between 'present'
>>> and 'future' accessibility is bogus - there no longer is any real
>>> difference. And if there is no longer a difference, then the
>>> proponents of present accessibility should probably be considering
>>> future accessibility as a matter of course.
>>>
>>> I'm sure most will continue to treat such matters as a 'horses for
>>> courses' situation, like you say. However, it just seems to make
>>> little sense to go to the effort of making information accessible NOW
>>> when it could theoretically be inaccessible 24 hours from NOW or even
>>> 3 years from NOW - and when some simple technical and administrative
>>> measures could have been taken to prevent any consequent
>>> inaccessibility. It also appears to be inconsistent with Stevan
>>> Harnad's definition of 'immediate access', which suggests that
>>> information be accessible "today, tomorrow and into the future".
>>>
>>> Regards,
>>>
>>>
>>>> -----Original Message-----
>>>> From: J.W.T.Smith [mailto:J.W.T.Smith_at_kent.ac.uk]
>>>> Sent: 03 March 2006 17:29
>>>> Cc: LIS-ELIB_at_JISCMAIL.AC.UK
>>>> Subject: Re: preservation vs. Preservation
>>>>
>>>>
>>>>
>>>> Comments below.
>>>>
>>>>
>>>>> John,
>>>>>
>>>>>> Preservation and access are two different things.
>>>>>
>>>>> I have to disagree. Preservation is inextricably linked with
>>>>> access.
>>>>>
>>>>> To state that 'preservation and access are two totally different
>>>>> things' is - I find - a common misconception.
>>>>
>>>> I don't suffer from common misconceptions, but I am sometimes
>>>> misunderstood.
>>>>
>>>>> Preservation (with a capital P) is not merely about preserving
>>>>> digital objects for posterity as an end in itself (which is, of
>>>>> course, important); it is about preserving the digital integrity
>>>>> of the object(s) so as to ensure it remains *accessible* ad
>>>>> infinitum.
>>>>>
>>>>> Robust Preservation strategies always ensure sufficient
>>>>> administrative metadata (technical metadata, rights metadata,
>>>>> etc.) is recorded because without it, user access can
>>>>> theoretically be jeopardised at *any* point in the future. The
>>>>> rate of technical and software obsolescence is such that deposits
>>>>> made to IRs today could - theoretically - be inaccessible in five
>>>>> years. Preservation is no longer some triviality that can be
>>>>> addressed far, far in the future by 'someone else'. IR
>>>>> administrators / libraries have to be in a position to regularly
>>>>> migrate or refresh materials to preserve continued user access.
>>>>> Their ability to do so is predicated on preparing suitable
>>>>> Preservation strategies.
>>>>>
>>>>> Thus, to suggest that Preservation entails 'limiting' or
>>>>> 'screening' access is - in my opinion - to entirely misinterpret
>>>>> the purpose of digital preservation. If efforts at attaining '100%
>>>>> OA via 100% self-archiving' are not to be in vain, the need for
>>>>> Preservation (with a capital P!) should not be pooh-poohed.
>>>>
>>>> I did not "pooh-pooh" anything. What you say is true but it is not
>>>> relevant to what I wrote. All this has nothing to do with making
>>>> information accessible NOW. You have failed to distinguish between
>>>> present and future accessibility.
>>>>
>>>> To clarify, for the here and now, I believe Preservation is not the
>>>> same thing as making accessible and those whose main interest is
>>>> accessibility NOW should not spend too much time on worrying about
>>>> Preservation. Now [at this time, currently] PDF is an excellent way
>>>> of making information available, but I would not suggest it as a
>>>> preservation format. Since there has been a prevalence of poor
>>>> quality metaphors/analogies in this discussion I could say this is
>>>> a horses for courses situation.
>>>>
>>>> Regards,
>>>>
>>>> John Smith.
>>>
>>
Received on Tue Mar 07 2006 - 14:08:57 GMT
