Re: Open Research Metrics

From: Andrew McCallum <mccallum_at_CS.UMASS.EDU>
Date: Sun, 10 Dec 2006 08:30:11 -0500

Greetings! I've been lurking on this list for a few months, and
enjoying the messages. I have a lot to say about these topics, and I
think it's high time I chimed in, in particular on the topic of
Research Metrics.

Brief self-introduction: I'm a professor of CS at UMass Amherst,
doing research in machine learning, NLP, and digital libraries. In
1998, while at CMU, I built Cora, an early contemporary of CiteSeer,
which used a lot of machine learning to do accurate metadata
extraction and entity resolution. In 2003 I received significant
funding from the NSF to research large-scale information extraction
and data mining, and create an enhanced alternative to CiteSeer and
Google Scholar. The result is a new system called "Rexa".

Rexa is a digital library covering the computer science research
literature and the people who create it. Rexa is a sibling to
CiteSeer, Google Scholar, and the ACM Portal. Its
chief enhancement is that Rexa knows about more first-class,
de-duplicated, cross-referenced object types: not only papers and their
citation links, but also people, grants, topics---and in the future
universities, conferences, journals, research communities, and more.

Many relevant publications are at

Rexa currently provides:
* Keyword search on over 7 million papers (mostly in computer science)
* Cross-linked pages for papers, authors, topics and NSF grants
* Browsing by citations, authors, co-authors, cited authors, citing
   authors (find who cites you most by clicking "Citing authors" on
   your home page)
* Web-2.0-style "tagging" to bookmark papers
* Automatically-gathered contact info and photos of authors' faces
* Ability to send Rexa invitations to additional people of your choosing

Coming soon:
* Various bug fixes.
   (For example, I think that if you select "Remember my ID", login
   will fail.)
* Much improved coverage of recent CS papers (it's fairly weak now)
* Ability to make corrections to extracted data
* Home pages for institutions and venues (already running in the lab)
* Improved author coreference. Tough problem, on which we do much

Coming later:
* Improved extraction accuracy
* Much more data mining, topic analysis, trend analysis, etc.
* Broader coverage of more research fields

I'll close here and comment on Research Metrics in my next message.


Andrew McCallum          
Associate Professor      
Comp Sci Dept, UMass Amherst       413-545-1323 (w)
On Dec 9, 2006, at 6:37 AM, Stevan Harnad wrote:
> On Fri, 8 Dec 2006, Peter Suber wrote:
>>   If the metrics have a stronger OA connection, can you say something
>>   short (by email or on the blog) that I could quote for readers who
>>   aren't clued in, esp. readers outside the UK?
> Dear Peter,
> Sure (and I'll blog this too, hyperlinked):
> (1) In the UK (Research Assessment Exercise, RAE) and Australia
> (Research Quality Framework, RQF) all researchers and institutions
> are evaluated for "top-sliced" funding, over and above competitive
> research proposals.
> (2) Everywhere in the world, researchers and research institutions
> have research performance evaluations, on which careers/salaries,
> research funding and institutional/departmental ratings depend.
> (3) There is now a natural synergy growing between OA self-archiving,
> Institutional Repositories (IRs), OA self-archiving mandates, and the
> online "metrics" toward which both the RAE/RQF and research
> evaluation in general are moving.
> (4) Each institution's IR is the natural place from which to derive
> and display research performance indicators: publication counts,
> citation counts, download counts, and many new metrics, rich and
> diverse ones, that will be mined from the OA corpus, making research
> evaluation much more open, sensitive to diversity, adapted to each
> discipline, predictive, and equitable.
> (5) OA Self-Archiving not only allows performance indicators (metrics)
> to be collected and displayed, and new metrics to be developed, but OA
> also enhances metrics (research impact), both competitively (OA vs.
> NOA) and absolutely (Quality Advantage: OA benefits the best work the
> most, and Early Advantage), as well as making possible the
> data-mining of the OA corpus for research purposes. (Research
> Evaluation, Research Navigation, and Research Data-Mining are also
> very closely related.)
> (6) This powerful and promising synergy between Open Research and Open
> Metrics is hence also a strong incentive for institutional and funder
> OA mandates, which will in turn hasten 100% OA: Their connection needs
> to be made clear, and the message needs to be spread to researchers,
> their institutions, and their funders.
> Best wishes,
> Stevan
> PS Needless to say, closed, internal, non-displayed metrics are also
> feasible, where appropriate.
Received on Sun Dec 10 2006 - 18:22:41 GMT
