Re: subject classification

From: Stevan Harnad
Date: Wed, 25 Jun 2008 11:35:15 -0400

Arxiv has been up for 17 years and its users have been doing the keystrokes
spontaneously: No keystroke inertia, no need for mandates.

But this spontaneous keystroking (which has also been there among computer
scientists and economists) has, in 17 years, likewise failed utterly to
generalize to the rest of the scholarly scientific community.

That is why it is more keystroke mandates, not more keystrokes, that are
urgently needed for the vast non-keystroking majority today.

Once they have caught up with the physicists, computer scientists and
economists, we can talk about adding more keystrokes.

If Arxiv had had to face keystroke inertia too, I am sure it would have
happily sacrificed the bit of extra functionality a few classification
keystrokes might provide, for the monumental functionality provided by
having the full-text content itself!

Stevan Harnad

On 25-Jun-08, at 10:35 AM, Simeon Warner wrote:

> Andy Powell wrote:
> > I'd therefore be tempted to re-ask your question in a slightly
> > different, two-part, form:
> > 1) is there any evidence that the value of manually assigning subject
> > classifications to open access scholarly publications improves scholarly
> > communication sufficiently over full-text indexing approaches to
> > outweigh the costs of doing so? (My answer: almost certainly not).
> While I mostly agree that the focus of effort should be on automatic
> classification, I think arXiv serves as an example of the use of a
> manual classification which has high value. The author-supplied
> classifications have driven alerting for 17 years now and I think have
> been important in acceptance of arXiv through the fostering of a sense of
> community.
> We also use the author-supplied classification to direct new submissions
> to appropriate moderators. We are currently experimenting with the use
> of an automatic classifier to alert administrators to possible
> mis-classifications, and later to suggest classifications to submitters.
> Our experience from extraction of articles from the existing corpus to
> seed the quantitative biology category (q-bio) was positive and is
> described in . It may be that at some time arXiv could do away with the
> manual classification but it may have lasting value in community
> building, in providing the user with a sense of agency, and as a
> double-check.
> One aspect of automatic classification we should not forget is that one
> can rerun over the whole corpus at any time -- something simply
> impractical with manual schemes. Thus it can be expected to cope with
> changes in a classification scheme as subjects evolve, or provide
> different views for different user communities (given an appropriate
> training set).
> Cheers,
> Simeon
Received on Wed Jun 25 2008 - 16:49:37 BST
