Re: Self-Archiving and Journal Subscriptions: Critique of PRC Study

From: Stevan Harnad
Date: Fri, 17 Nov 2006

On Thu, 16 Nov 2006, Simon Inger and Chris Beckett wrote:

> 1. The methodology deployed and the entire point of conducting a
> conjoint survey at all:
> We decided to undertake a conjoint survey because we felt that other
> attitudinal surveys of what future intentions might be were highly prone to
> being bogged down exactly because surveyees were asked in absolute terms to
> what extent they would like one scenario, and then another, without ever
> asking them to choose between them.

Simon and Chris are, I think, quite right that there is considerable danger
of bias, in one direction or the other, when acquisitions librarians are asked
to speculate about what they would do in hypothetical future scenarios.

But it is not at all clear that the method Simon and Chris used
corrects for these biases, or merely changes the subject (from
predicting cancellations under hypothetical conditions, to merely
expressing product/property preferences under hypothetical conditions).

> A survey that asks people if they like
> steak to eat, and then asks if they like chicken to eat, is not as powerful
> as a survey that asks them to choose between steak and chicken. Bring in
> another variable, such as, "how well done do you like your meat?" and you
> get a very different answer depending on whether the surveyee preferred
> steak or chicken in the first place. By combining these factors with others
> through a conjoint survey, you might just find out how bad the steak has to
> be before chicken tartare starts to command a market share! We hope this
> illustrates the whole purpose of the conjoint in applying it to the
> situation that publishing currently faces; it forces people to reveal the
> true underlying factors in their decision-making in a way that hasn't been
> done before.

The conjoint method is no doubt a good method for estimating or ranking
relative product property preferences in general. But in the particular
case of library journal acquisitions/cancellations, OA and self-archiving,
as noted, the method not only does not remedy the possibility of
bias, but it bypasses the question of cancellations altogether -- the
question that I take it that (for lack of actual cancellation data)
the survey was trying to answer.

> 2. Whether or not OA can be considered a product in any meaningful sense:
> . Can articles in Open Access repositories be considered a product and
> one that librarians may select instead of journals? Absolutely they can. Is
> the issue here that they are free via OA, or that they are not organised and
> packaged? If we were to stand on a street corner and give away mobile
> phones, they would be every bit as much as a product as one you paid for in
> a shop. Would we cause some people not to go into the shop and buy a mobile
> - sure we would. Would some people not trust the mobile we gave them and buy
> one anyway - yes they would. Would some people use our mobiles as a spare
> and buy another anyway - yes they would do that too. A survey might tell you
> in what proportions people would undertake these actions. But you can be
> certain that at least some of the people would use the mobile we gave them
> and postpone or cancel the acquisition of a paid-for phone. So we believe
> that articles via OA, even though they are free, are still very much a
> product. So perhaps they should not be considered as a product because they
> are not organised into product-shaped offerings, like journals are.

I'm afraid I cannot agree with this reasoning: The mobile phone analogy
(as well as the meat analogy) begs the question, because in both cases
the product and the client are unambiguous, and it is a straightforward
quid pro quo: Would the client rather buy steak or chicken? A mobile
phone or home phone? The choice is a direct trade-off between (two)
competing products. And I also agree that if one of them were free,
that would not change anything: It would still be this versus that.

But that's not at all how it is with paid journals vs. self-archived
OA content.

Let's start with an easy example: Suppose we weren't talking about
anarchically self-archived articles, but about OA vs. non-OA journals. And
to make it even simpler, let us suppose (as is the case, for example,
with BioMed Central journal institutional "memberships"), that a library
has a choice between two journals that are equated, somehow, in terms
of readership, quality, subject-matter and usage-needs of institutional
users, that there is only enough money to afford one of them, and that
they differ in that one is subscription-based and the other is based on
institutional "membership" fees (for publishing institutional articles).

That's an odd choice situation for an acquisitions librarian (since in
one case the librarian is buying in the journal's content, and in the
other the librarian is paying for the institution's own outgoing content),
but perhaps librarians would intuit that they get better value for their
institutional money from the second journal (especially if they consult
with their institutional users, and they agree -- a detail not mentioned
by the survey, which seems to assume subscription/cancellation decisions
are all or mostly in the hands of the librarians!).

But that would be a prima-facie plausible prediction by librarians, about
what they would prefer and do under those conditions. Even more plausible
would be a least/most choice involving *three* equivalent journals, when
the library can afford only two journals, and the third is an OA journal
for which someone else (other than the library) pays the institutional
OA charges, making it effectively "free" to the library. Under those
conditions the librarian could realistically say they'd prefer to "cancel"
the free (OA) journal (i.e., just let users download it for themselves,
free, from the web) so they can use all available money saved for the
other two journals.

(Of course, the tricky part is that a pure OA journal [e.g., BMC
or PLoS] is not one that a library subscribes to anyway! Actually,
most OA journals *are* available for subscription, and do not charge
author-institutions for publication. Possibly, just possibly, the results
of the PRC survey might have some predictive value as to whether *that*
kind of OA journal is likely to be cancelled; but so far there is little
actual evidence of that happening either, though it might! Keep your
eyes on the longevity of the majority of the OA journals in DOAJ that do not
charge for publication but make ends meet from subscriptions.)

But we have not yet come to the third option, the one that the survey was
commissioned by PRC to test, and that is author self-archiving, and
whether that will cause cancellations.

It is for author self-archiving that the question of the extra properties
of percentage content, and length of embargo had to be introduced and
varied in this study. Length of embargo is not the problem, but percentage
content very much is, and so is the fact that all self-archived content is
free. Here we are square in the middle of the profound difference between
OA journals (a complete, quid-pro-quo product) and OA self-archiving
(an anarchic process, applying to only a portion of content, and an
unknown proportion at that, growing -- but again at an unknown rate --
across time).

With journals (including OA journals), it's journal X vs journal Y
("product" X vs. "product" Y): Shall I purchase X and cancel Y,
or vice versa? Shall I purchase X and Y and cancel Z? These are
presumably familiar, hence realistic acquisitions librarian questions
(in consultation with users -- who were not surveyed in this survey!).

But what is the question with journals vs. anarchic self-archived
content? What is it that a librarian is contemplating buying versus
cancelling when what they are really faced with is a choice between a
journal and a distributed, anarchic and uncertain percentage of its
contents (with no indication of how it is even knowable what that
percentage is)?

But let's overlook that and agree that if it were a question of buying
vs. cancelling journal X based on some estimate of the percentage of its
contents that is available for free in self-archived form, librarians
could dream up a hypothetical preference from a combination of properties
such as journal quality, journal price, percentage free content, and
embargo length.

But that would be journal X vs. not-X, or journal X vs. Y. What is the
librarian's conjecture as to their preference when *all* journals have
PP% of their content self-archived? That's not a journal vs. journal
acquisition/cancellation question any more: It's asking librarians to
second-guess the OA future: Are we to infer from the conjoint preference
data that they would cancel *all* journals under those conditions
(second-guessing their users on how long they might, for example,
continue to value the paper edition)?

The analogy with chicken and steak would be whether conjoint chicken/steak
or mobile/home-phone property preferences predict whether and when people
would stop paying for food or phones altogether because they were somehow
miraculously available free with a certain probability (and/or delay)
for a certain percentage of the potential calls and time. We *know*
that if it were *all* free, immediately and with certainty, everyone
would prefer that. But do conjoint preferences tell us one bit more
than that? (And again we leave out the parties of the second part --
the institutional users -- as well as the paper edition and how they
might feel about it, and for how long...)

> That may be so, for now, but at the same time we are aware of organisations
> that are building products which combine the power of OAI-PMH (and the
> power of Google); existing abstracting & indexing databases; publisher
> link servers; and library operated link servers: to build an organised route
> to OA materials - a route that would allow a non-subscriber of a journal
> article to be directed to the free OA repository version instead. Once these
> products exist we are sure our research indicates that *some* librarians at
> least will actually switch to OA versions for *some* of their information
> needs, while others will continue to purchase the journal product for a
> whole raft of reasons and others will provide, i.e. acquire, both options.

Let me quickly agree about what I would not have contested from the very outset:

(1) Without the conjoint survey, I would already have agreed that everyone
prefers to have something for free rather than paying for it.

(2) I also happen to believe, personally, that once 100% OA self-archiving has
been reached -- but I don't know how soon it will be reached, nor how soon after
it is reached this will happen -- there will be cancellation pressure that will
lead to downsizing and a transition to OA publishing.

But it is still a fact that there is as yet no evidence of cancellation
pressure, and I do not at all see how the conjoint preference study tells
us any more than we already know (and don't know) about whether and when
and how much cancellation pressure will ever be caused by self-archiving.

(I have to add that I profoundly doubt that in the OA world libraries and
librarians will mediate in any way between users and the refereed journal
article literature. Library mediation will be as supererogatory as it is
with what users do with Google today.)

> 3. The issue of bias:
> The whole Open Access debate evokes an emotional response from
> publishers, librarians and researchers on both sides of the debate. At the
> same time, so does the word "cancellation". For that matter, so does the
> phrase "serials crisis". We wanted to avoid using all of these phrases in
> the research so as not to cloud people's judgement in favour of their
> beliefs alone. This is one way of avoiding one type of bias. Specifically
> the type of bias we sought to eliminate was an emotional bias, not a bias
> for or against OA per se. It can be equally well argued that another survey
> should be done with these words actually mentioned. The results may well be
> different. But no more or less valid than ours - such a survey would be
> measuring a different thing. It is up to each individual reader of the
> report to decide which kind of response and hence survey they would prefer.

I think the attempt to avoid all of these emotional (and notional)
biases was a commendable one, and it would have been successful too,
if the conjoint-preference method had been amenable to analysing the
anarchic phenomenon of author self-archiving and its likely effect on
librarian acquisition/cancellation. But it is not, because anarchic,
blanket self-archiving is simply not an acquisition/cancellation matter.

Acquisition/cancellation concerns what to buy, retain and cancel from
among a finite set of products using a finite acquisitions budget. It
is a competitive matter: competition between products. Anarchic
self-archiving is gradual and uncertain, but it generates only an
all-or-none cancellation question, and one that is in no way addressed
by the conjoint preferences method.

(I am sure, by the way, that librarians could have been polled -- directly
and unemotionally -- about how much journal content they thought would
have to be self-archived before they would no longer need to purchase
journals at all -- but I don't think their speculations on that would
have been very informative.)

I do think, though, that one indirect finding on this question did
emerge from the conjoint method (and it surprised me, considering how
strident some librarians have been in the opposite direction in the
past!): It does seem that librarians are surprisingly indifferent to the
difference between an author's refereed final draft and the publisher's
PDF. That's very interesting (and it's progress: in librarian awareness
and understanding of what researchers really do and don't need!).

> 4. The statement of apparently obvious or banal findings:
> The critique states that some of the findings are obvious and banal.
> "The fact that everyone would like something for free rather than paying for
> it", for example. In fact the survey shows that not everyone would prefer
> that. Even in a completely like for like situation. Possibly because people
> are suspicious of free things.

Agreed. (But that's hardly very surprising either! Nor informative about
whether and when self-archiving causes cancellations.)

> Much more important, however, is how the
> decision becomes qualified by other factors - *and to what extent* they are
> qualified. (Would you like free raw chicken for dinner or paid-for cooked
> chicken?) Look closely and the results show that the lure of "free" has only
> so much pulling power, and a combination of other factors pull more potently
> against it. So in themselves the importance of each of the attributes has
> limited value - it is in combination that their true meaning comes through.

I think what you are saying here is that in varying the combination of 6
properties, each with 3-4 possible values, you found a complex preferential
structure. But it still doesn't tell us whether and when self-archiving will
cause cancellations.

> 5. The validity of inferring cancellation behaviour from the findings:
> So, can we infer cancellation behaviour from the results? Yes, we
> can. Because it is unrealistic to expect that everyone that expresses a
> preference for acquiring a product that looks very much like content on OA
> repositories would still continue to acquire a paid-for version. Some will,
> of that we have very little doubt. But likewise some won't. To that end I
> think we *can infer cancellation will occur*. It may be after someone has
> provided an organisational layer on top of the repositories. It may be after
> improved librarian awareness of the alternative has occurred. And it may
> require way more than 15% of the material to be available on OA.

For those (like me) who happen to think that 100% OA self-archiving is
likely eventually to cause cancellations, downsizing, and a transition to
the OA cost-recovery model, but that there is as yet no evidence of this, and
that it is a matter of complete uncertainty how fast the self-archiving
will grow, how soon the cancellation pressure will be felt, and how
strong the cancellation pressure will be -- this study did not provide
any new information.

For those empiricists (for whom I have some sympathy too), who simply say
there is no evidence at all yet that self-archiving causes cancellations
-- and that even in the few fields where self-archiving has been at
or near 100% for some years there is still no such evidence -- it is
likewise true that this study has not provided any new evidence: neither
about *whether* there will be cancellations, nor, if so, about when and
how much.

Stevan Harnad
