
From: Stevan Harnad <harnad_at_ecs.soton.ac.uk>

Date: Sun, 16 Jan 2005 17:20:10 +0000 (GMT)

> > [This recursive technique is analogous to Google's PageRank, hence could
> > perhaps be called CiteRank; it is ironic that Google got the idea of
> > PageRank from citation ranking, but then improved it, yet the improvement
> > has not yet percolated back to citation ranking, because ISI had no
> > particular motive to implement it -- perhaps even a disincentive, as it
> > might reduce the journal impact factor of the large, average journals
> > which are of necessity ISI's numerical mainstay!]

>
> Dear Stevan,
>
> You are right that this is long overdue at ISI, but the idea predates
> Google or citation ranking. In my own field this kind of eigenvector
> measure was first proposed by Katz and by Hubbell in the 1950s. The
> canonical reference, which includes all the math worked out in some detail,
> is L. Katz, "A new status index derived from sociometric analysis",
> Psychometrika 18, 39-43 (1953).
>
> You're the expert in this field and I'm definitely not, but I'm sure I have
> been told that some citation indexing services (like CiteSeer)
> implement ranking mechanisms like this, so it would appear your first wish
> is already granted -- people have indeed decided not to wait for ISI.

I'm not the expert, but I can certainly ask Lee Giles whether CiteSeer is
based on unweighted citation counts or recursive citation weighting (Lee?).

> There are however technical problems with the measure as you describe it:
> if my paper is only cited by those with zero "citerank" then I have zero
> citerank too.

I should think the citer's citerank could be initialized either with
epsilon, or with some function of total overall citations, so all citations
are nonzero and weights are somehow normalized (either locally, globally, or
within a relatively closed citation circle).

> But the papers citing me have zero citerank if they are
> cited by still others with zero citerank, and so forth. What this means in
> practice is that if you actually do the calculation you are proposing,
> every paper will get a citerank of zero and the measure is useless. (In
> mathematical terms, since the citation network is (roughly) acyclic, it has
> a nilpotent adjacency matrix.) This seems to be the main reason why such a
> scheme has not been widely implemented yet -- you have to do something more
> sophisticated to actually get an answer. The crudest solution is to add a
> constant term to the equation, as pagerank does, so that every paper gets a
> small "free" amount of citerank, regardless of whether it is ever cited,
> but this is a somewhat artificial solution. There may be better things to
> be done, such as Jon Kleinberg's hubs-and-authorities method that you
> mention.
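
The collapse described in the quoted passage can be checked on a toy example. What follows is only a sketch under stated assumptions: the four-paper graph and the names `cites` and `recursive_step` are hypothetical, and the unnormalized update rule (a paper's score is the sum of its citers' scores) is one simple reading of the recursion, not a description of any existing service.

```python
# Hypothetical toy citation graph: cites[p] lists the papers that p cites.
# Citations only point to earlier papers, so the graph is acyclic.
cites = {
    "D": ["B", "C"],
    "C": ["A", "B"],
    "B": ["A"],
    "A": [],
}


def recursive_step(score):
    """One pass of the pure recursion: new score = sum of the citers' scores."""
    new = {paper: 0.0 for paper in cites}
    for citer, cited_list in cites.items():
        for cited in cited_list:
            new[cited] += score[citer]
    return new


# Start every paper at the same nonzero value; the choice does not matter.
score = {paper: 1.0 for paper in cites}
history = [score]
for _ in range(len(cites)):
    score = recursive_step(score)
    history.append(score)

for step, snapshot in enumerate(history):
    print(step, snapshot)
```

In this example the longest citation chain (D cites C cites B cites A) has three links, so by the fourth pass every score is exactly zero. That is the nilpotency argument in miniature: repeated application of the adjacency matrix of an acyclic graph eventually gives the zero vector, whatever the starting values.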

I am sure there are optimal ways to initialize or bound such a recursion. I
am not expert enough technically to do more than intuit the way, but it
is obvious that it can be done!
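
One way such a bounded recursion might look is the constant-term variant mentioned in the quoted text: every paper receives a fixed base score plus a damped share of its citers' scores, in the spirit of Katz's status index. The sketch below is illustrative only. The function name `citerank`, the parameters `base` and `alpha`, and the toy graph are all assumptions made here, and the per-citer normalization that PageRank applies is deliberately left out to keep the example short.

```python
def citerank(cites, base=1.0, alpha=0.5, max_iter=100, tol=1e-12):
    """Iterate score[p] = base + alpha * sum(score[q] for each q citing p).

    cites maps each paper to the list of papers it cites (hypothetical input).
    base is the "free" score every paper gets; alpha damps inherited score.
    """
    score = {paper: base for paper in cites}
    for _ in range(max_iter):
        new = {paper: base for paper in cites}
        for citer, cited_list in cites.items():
            for cited in cited_list:
                new[cited] += alpha * score[citer]
        # Stop once no score moves by more than the tolerance.
        if max(abs(new[p] - score[p]) for p in cites) < tol:
            return new
        score = new
    return score


# Hypothetical toy graph: D cites B and C, C cites A and B, B cites A.
toy_cites = {"D": ["B", "C"], "C": ["A", "B"], "B": ["A"], "A": []}
ranks = citerank(toy_cites)
for paper in sorted(ranks, key=ranks.get, reverse=True):
    print(paper, ranks[paper])
```

On this toy graph the uncited paper D keeps only its base score of 1.0, while A, which is cited by the two papers that are themselves cited, ends up highest at 2.875. Because the graph is acyclic the iteration settles after a number of passes bounded by the longest citation chain; on a graph with cycles, `alpha` would need to be small enough for the sum to converge.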

Stevan Harnad

Received on Sun Jan 16 2005 - 17:20:10 GMT

