Re: Optimising Deposit and Import Capabilities of EPrints

From: guedon <>
Date: Fri, 18 Aug 2006 10:13:29 -0400

(was Re: measuring affiliation)

Author disambiguation is indeed a really important issue. It affects
all kinds of things, ranging from the Science Citation Index to even
some commercial offerings. For example, while searching for an author in a
Springer journal the other day, I noticed that their own search engine
distinguished between the author's name with the full first name and the
same author's name with just the initial... I had to search through two
lists of articles instead of one.
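
To make the two-lists problem concrete, here is a minimal Python sketch of
the kind of name folding a search engine could apply. The function name
name_key and the "Surname, Given names" input format are my own
assumptions for illustration; this is not how Springer or anyone else
actually does it.

```python
import unicodedata


def name_key(name: str) -> tuple[str, str]:
    """Reduce 'Surname, Given names' to (surname, first initial).

    Case and diacritical marks are ignored. Hypothetical helper,
    for illustration only.
    """
    def fold(text: str) -> str:
        # Decompose accented characters, then drop the combining marks.
        decomposed = unicodedata.normalize("NFKD", text)
        stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
        return stripped.casefold().strip()

    surname, _, given = name.partition(",")
    return fold(surname), fold(given)[:1]


# The full first name and the bare initial now land in the same list...
same_list = name_key("Guédon, Jean-Claude") == name_key("Guedon, J.")
# ...but so does any other author sharing that surname and initial.
false_merge = name_key("Guédon, Jean-Claude") == name_key("Guedon, Jacques")
```

Both comparisons come out true, which is the limitation: folding names
merges the two lists but also merges different people, so it cannot
replace a real identifier.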

I believe that scientific and scholarly authors ought to be given a
permanent identifier which ought to accompany their publications in any
journal that carries peer review. In effect, it would be the equivalent
of an ISBN.

The easiest way to begin implementing this PAI (Permanent Author
Identifier) might be for a group of journals to come together and agree
that when a paper is submitted, the author must supply his/her permanent
identifier. If he/she does not have one, indicating so would mean that
the cooperating publisher would assign one immediately and would place
it in an open database. Universities could encourage their students to
take up such an identifier as soon as they are on a track (e.g.
doctoral studies) that should lead to some publishing.
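
As a rough sketch of the submission step described above, assuming a
simple in-memory stand-in for the open database: the names PaiRegistry
and handle_submission and the "PAI-000001" identifier format are all
invented here for illustration, not an existing scheme.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PaiRegistry:
    """Hypothetical stand-in for the open database of author identifiers."""
    records: dict[str, str] = field(default_factory=dict)

    def assign(self, author_name: str) -> str:
        # Assumed identifier format: a zero-padded running number.
        pai = f"PAI-{len(self.records) + 1:06d}"
        self.records[pai] = author_name
        return pai


def handle_submission(registry: PaiRegistry, author_name: str,
                      pai: Optional[str] = None) -> str:
    """Return the identifier to attach to a submitted paper."""
    if pai is None:
        # The author indicated having no identifier: assign one immediately.
        return registry.assign(author_name)
    if pai not in registry.records:
        raise KeyError(f"unknown identifier: {pai}")
    return pai


registry = PaiRegistry()
first = handle_submission(registry, "Jean-Claude Guédon")
# A later paper under a different spelling reuses the same identifier.
second = handle_submission(registry, "J.-C. Guédon", pai=first)
```

The point of the sketch is only that the identifier, not the spelling of
the name, ties the two submissions together.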

In conclusion, I do not claim to have clear strategies about this PAI,
but the need for one appears very high to me. In particular, it would be
very useful for institutional repositories and the OA movement in

Google and other large search engines might be interested in supporting
such a development. It would greatly enhance the capability of Google
Scholar. Countries that do not use the Latin script or use it with funny
diacritical marks (as in Guédon) might also find it useful to have their
scientists unambiguously visible in the whole world, even though this
might decrease the number of "scientists" for any given country.



On Friday 18 August 2006 at 08:51 -0400, Timothy Miles-Board wrote:
> The EPrints team have been looking at this issue in some detail. The current
> version of EPrints has "clone" and "new version" options which save having
> to re-enter metadata for similar/different versions of an existing deposit.
> However, this doesn't help much if you are starting a new deposit. The
> approach we've been favouring of late is auto-completion (like Google
> Suggest), whereby the depositor begins typing
> the first few characters of the name of a co-author and is presented with a
> pop-up list of suggestions. The behind-the-scenes logic that determines what
> to suggest can be customised to an individual repository's requirements e.g.
> suggest from the list of registered users, suggest by looking up in the
> institution's user account (e.g. LDAP) server, suggest according to an
> internal database list of institutional and non-institutional users. The
> previous deposits that you have made can also inform the list of suggestions
> e.g. frequent/recent co-authors can be promoted to the top of the list of
> suggestions.
> This is not just about minimising keystrokes - the suggestion mechanism we
> implemented is also able to carry additional data about the authors being
> suggested. You mention the potential for cross-linking an author's work
> between archives. In order to do this you need to be able to uniquely
> identify them. Author disambiguation is potentially important for the
> Research Assessment Exercise (RAE) in the UK. When an author's name is
> autocompleted, the ID of that author is also attached.
> We have also successfully applied the auto-completion technique to keywords
> and journal names (with the ISSN number of the journal being passed with the
> suggestion and used to auto-fill the ISSN field upon selection of the
> intended journal by the user).
> Although for the moment we've decided not to include it in the next version
> of EPrints (3.0), it will be in a future version. In the meantime, I'd be
> happy to describe our technique in more technical detail on the
> wiki if that would be useful (creating an autocompleting field in the
> EPrints deposit form using an open source AJAX library is straightforward;
> the complicated bit comes in designing the (independent) program that makes
> appropriate and useful suggestions in response to the user's keystrokes).
> It is also worth noting that EPrints 3.0 will have a number of new options
> for importing data e.g. users can create new deposits by cutting and pasting
> BibTeX/EndNote/etc entries from a bibliography file into a textbox and
> hitting a button.
> Tim
> --
> Timothy Miles-Board
> EPrints Services
> Southampton, UK
> Consultancy - Training - Hosting
> On Tue, 15 Aug 2006 11:08:58 +0100, Andrew A. Adams
> <a.a.adams_at_READING.AC.UK> wrote:
> >Regarding this note, one of the things we're struggling with in setting up a
> >pilot of an IR at the University of Reading (the School of Systems
> >Engineering and the School of Maths, Meteorology and Physics are jointly
> >piloting an IR for the University) is that of manually inputting local
> >institutional co-authors. It's one of the weaknesses, IMHO, of the GNU
> >eprints software that it doesn't have two methods of author input - selection
> >from a list of institutional users already registered, and free text input of
> >non-institutional authors. In fact, even with non-institutional authors, it's
> >quite common to regularly author joint papers with the same
> >non-co-institutional author a number of times, if one has a productive external
> >collaboration. I would prefer, rather than manually entering each author name
> >in free text, to have a search system available for "registered authors" not
> >all of whom need to be registered users of the system (which deals with the
> >issue of people leaving institutions and stopping being registered users but
> >remaining as authors for their prior papers). If a new co-author is to be
> >entered, then minimising the number of keystrokes, and the utility of having
> >more than just free-text name entry available (though not necessarily
> >mandated), should be considered. As the IR grows then, if it is deemed useful,
> >people can be employed to add extra information onto the non-user author
> >details, such as affiliation at the time the paper was deposited, and
> >possibly cross-links to other IRs containing the works of that author (which
> >could also be useful for authors moving between institutions).
> >
> >
> >--
> >*E-mail********* Dr Andrew A Adams
> >**snail*27 Westerham Walk********** School of Systems Engineering
> >***mail*Reading RG2 0BA, UK******** The University of Reading
> >****Tel*+44-118-378-6997*********** Reading, United Kingdom
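
For readers who want to picture the suggestion logic described in Tim's
quoted message (prefix matching, frequent co-authors promoted to the top,
and an author ID travelling with each suggestion), a minimal Python sketch
might look like the following. It is not the EPrints implementation; the
Author record, the suggest function, the IDs and the co-author counts are
all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Author:
    author_id: str  # hypothetical unique ID passed along with the suggestion
    name: str


def suggest(prefix, authors, coauthor_counts, limit=5):
    """Return authors whose name starts with prefix, frequent co-authors first.

    coauthor_counts maps author_id to the number of previous joint deposits
    with the current depositor (an assumed data source).
    """
    wanted = prefix.casefold()
    matches = [a for a in authors if a.name.casefold().startswith(wanted)]
    # Sort by descending co-author count, then alphabetically by name.
    matches.sort(key=lambda a: (-coauthor_counts.get(a.author_id, 0), a.name))
    return matches[:limit]


authors = [
    Author("id-1", "Adams, Andrew A."),
    Author("id-2", "Adamson, Example"),  # invented entry
    Author("id-3", "Miles-Board, Timothy"),
]
suggestions = suggest("ada", authors, {"id-2": 3})
```

Because each suggestion is a whole record rather than a bare string,
selecting one fills in the ID as well as the name, which is the
disambiguation benefit the message points out.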
Received on Fri Aug 18 2006 - 19:00:22 BST
