RE: [ozeprints] Subject terms and repositories

From: Arthur Sale
Date: Fri, 3 Mar 2006 16:38:48 +1100

Belinda and others

I have long held the view that the main purpose of subject
classifications in Open Access Repositories is to satisfy people other
than searchers - in other words archivists and DEST classifiers.  Let's
face facts. Almost no researcher in the world is likely to search the UQ
repository using its own search engine, except UQ researchers. They will
discover your content via OAI harvesters and gateways, or via crawlers
and search engines (highest probability). Keywords play only a very minor
role in this, and subject classifications even less. If you look at the
access logs of the UQ repository I am very confident that you will find
that the majority (>80%) of external entrants to the repository enter via
a search engine, gateway or direct link to a document. This is my
experience at UTas.
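
For anyone who wants to check this against their own logs, here is a rough
sketch rather than a finished tool. It assumes the web server writes Apache
Combined Log Format; the hostname and the list of search-engine markers are
placeholders I have made up for illustration, and the matching is crude. It
counts every logged request by referrer, so the "internal" bucket is mostly
on-site navigation, and setting that bucket aside gives only an approximation
of how external visitors arrive. Gateways and OAI service providers will land
in "other_external" unless you add them to the list.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical values: replace with your repository's host and the
# search engines or gateways you care about.
REPOSITORY_HOST = "eprints.example.edu.au"
SEARCH_ENGINES = ("google.", "yahoo.", "msn.")

# Tail of a Combined Log Format line:
#   "request" status bytes "referrer" "user-agent"
LOG_LINE = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referrer>[^"]*)" "[^"]*"$')


def classify_referrer(referrer):
    """Put one referrer string into a coarse bucket."""
    if referrer in ("", "-"):
        return "direct"  # no referrer: bookmark, typed URL, emailed link
    host = urlparse(referrer).hostname or ""
    if host == REPOSITORY_HOST:
        return "internal"  # arrived from another page of the repository
    if any(marker in host for marker in SEARCH_ENGINES):
        return "search_engine"  # crude substring match on the host name
    return "other_external"


def entry_shares(lines):
    """Return the fraction of parsable log lines in each bucket."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line.strip())
        if match:
            counts[classify_referrer(match.group("referrer"))] += 1
    total = sum(counts.values())
    return {bucket: n / total for bucket, n in counts.items()} if total else {}
```

Feeding it the lines of an access log, for example
entry_shares(open("access.log")), returns a dictionary of shares per bucket;
the file name here is again only an example.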

As to ASRC, its use as a primary classification serves one and only one
purpose: to satisfy DEST and ABS. One can say that because of this the
researchers know it, but this is quite clearly derivative. The rest of
the world (even our close neighbour New Zealand) knows nothing about ASRC
and, if they ever see it, will either ignore it or be irritated by such
a parochial classification and such limited vision. I only mention in
passing that ASRC is also hopelessly deficient and out of date in
identifying many current fields of research - you and I have corresponded
on bioinformatics, e-research, metabolomics, etc. before.


Repository documents ought to have a subject classification, and this
must be an internationally understood one, say LC or Dewey. This should
be standard in software releases; UQ's Fez team please note because
you don't do this.

There is no harm in having ASRC additionally in Australia, but it must
definitely be an add-on, not a replacement. Exactly how this is done is
an implementation issue, and there are several ways to achieve the aim,
in the interests of DEST and RQF reporting. But we mustn't tolerate
the RQF tail wagging the Open Access dog.


> -----Original Message-----
> From:
> On Behalf Of Belinda Weaver
> Sent: Friday, 3 March 2006 14:57
> To:
> Cc:
> Subject: [ozeprints] Subject terms and repositories
>
> Dear all
>
> I received a copy of an email today where open access advocate Stevan Harnad
> appeared to rubbish the idea of controlled subject terms being used (or even
> useful) for repositories. I thought I might circulate his remarks and my
> response to them to this list as I think subject terms are an important part
> of making repository content visible, useful and retrievable.
> Does anyone else want to comment on this? Given that many institutions are
> on the brink of launching repositories, I thought this might be a handy
> discussion to have now.
> His comments were:
> > Subjects? Subject search?
> > What utter flotsam! A 1980's library-aided Dialog search hang-over.
> > (In the online/OA millennium, it is Google-style full-text Boolean
> > search, not subject search any more. Completely obsolete; alive now
> > only in library cataloguers' imaginations.) (Also, no one ever made
> > subject catalogue cards for journal articles!)
> In reply I said:
> I would take issue with this statement - a very large number of searches on
> our ePrintsUQ repository are by (controlled) subject terms. The thesaurus we
> use, the Australian Standard Research Classification, is well known to
> Australian academics as it is used for reporting publications annually to
> DEST, when applying for ARC grants and so on. Therefore when looking for
> materials, many people follow that approach of going directly to the ASRC
> code that interests them. That way they get publications that are
> specifically and substantially about, say, signal processing rather than
> items that might simply mention signal processing in passing.
> Keywords go in and out of fashion. Repositories that rely on keyword search
> alone will quickly fill up with 'dark matter' - where no search on current
> keywords actually retrieves certain records that nevertheless may be
> relevant. Controlled subject search is actually far more efficient in
> finding items in a database than full text searching by keyword. Keyword
> searches return many results, many of borderline relevance or no relevance
> at all. Subject searches result in fewer, more targeted results. Material
> that is not subject-classified may end up being virtually lost. An example -
> the book 'What colour is your parachute' is about career planning. It is
> classified in our library catalogue with terms such as career planning,
> jobs, and so on. Without that classification, who would find that title,
> given that the relevant keywords a searcher would use do not appear anywhere
> in it? Journal articles in the arts are full of titles that bear little
> relevance to their contents. Subject classification helps locate those.
> In any case, we offer both controlled subject searching and keyword
> searching, so users can choose. Why not offer multiple pathways to data?
> There seems little harm in that. And while card catalogues for journal
> articles might not have existed, almost all journal databases do use
> controlled subject headings to facilitate searching, and this is the method
> I always use in journal databases - keyword search first to identify
> possibly useful records. When one record seems particularly good, I extract
> the subject terms allocated, and rerun the search on that term, limiting to
> that field. I always find records that keyword search did not retrieve and
> the records I do get are much more substantially on topic than the
> keyword-only results.
> regards
> Belinda
> Belinda Weaver,
> Coordinator, ePrintsUQ,
> The University of Queensland Library,
> The University of Queensland,
> Brisbane Australia 4072.
> T: +617 3365 8281
> F: +617 3365 7930
> E:
> W:
Received on Fri Mar 03 2006 - 21:06:33 GMT
