[SIGMETRICS] Continuous multi-metric research assessment

From: Stevan Harnad <harnad_at_ecs.soton.ac.uk>
Date: Fri, 16 Nov 2007 14:17:09 +0000

Administrative info for SIGMETRICS (for example unsubscribe):
http://web.utk.edu/~gwhitney/sigmetrics.html

On Fri, 16 Nov 2007, Jonathan Adams (Director, Evidence Ltd) wrote:

> Stevan
> Following on from your comments at the end of last week (below) I agree
> that it is possible tentatively to pick out 'over-production' of poor
> quality papers (although I am less optimistic about the comprehensive
> analytical detection of publication abuse you foresee).

Jonathan,

I think you may be greatly underestimating (1) the power of multivariate
(as opposed to univariate) analysis, validation and weighting as well
as (2) the power of open access (i.e., online, public, pervasive,
continuous, and dynamic) metrics.

You get a completely different sense of what is possible, and how, if
you think in terms of:

    (i) individual, isolated metrics, assessed at long intervals under
    closed scrutiny (like the current RAEs)

or if you think instead in terms of:

    (ii) a large (possibly growing) battery of candidate metrics,
    assessed jointly and continuously rather than at long intervals,
    with the contribution of each metric to their joint predictive power
    initially validated against existing criteria that have been relied on
    before (such as the RAE panel rankings) and then updated dynamically,
    field by field, by adjusting the weights on each component metric
    -- and always under open scrutiny.
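
By way of illustration only (the metric names and numbers below are
invented, and the sketch is in Python rather than anything an
assessment body actually runs), this is roughly what such a jointly
validated weighting amounts to: regress the candidate battery against
the prior panel rankings, inspect the fitted weights, and keep
re-estimating them, field by field, as the open data accumulate:

    import numpy as np

    # Hypothetical battery of candidate metrics: one row per department,
    # one column per metric.  (A real validation would use far more rows,
    # e.g. all RAE-submitted units in a field.)
    metric_names = ["articles", "citations", "downloads", "coauthors", "h_index"]
    X = np.array([
        [120, 2400, 15000, 3.1, 28],
        [ 95, 3100, 22000, 4.0, 35],
        [200, 1800,  9000, 2.2, 21],
        [ 60, 2600, 17000, 3.5, 30],
        [150, 2900, 20000, 2.9, 33],
    ], dtype=float)

    # Prior RAE panel rankings for the same departments: the initial
    # validation criterion.
    panel_rank = np.array([5.2, 6.1, 3.8, 5.7, 5.9])

    # Standardise each metric so the fitted weights are comparable.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)

    # Fit the joint weights by least squares: panel_rank ~ Z.w + b
    A = np.column_stack([Z, np.ones(len(Z))])
    coef, *_ = np.linalg.lstsq(A, panel_rank, rcond=None)
    weights, intercept = coef[:-1], coef[-1]

    for name, w in zip(metric_names, weights):
        print(f"{name:>10s}: weight {w:+.2f}")

    # The fitted equation can then score new or updated profiles
    # continuously, and the weights can be re-validated and adjusted,
    # field by field, under open scrutiny.
    predicted = Z @ weights + intercept
    print("predicted:", np.round(predicted, 2), "panel:", panel_rank)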

Not only can "overproduction" of lightweight papers be detected and
weighted simply by profiling on the joint relation between (say) the
article count, the article citation count, the journal citation average
("impact factor") and the journal download count -- but other anomalous
or abusive profiles can likewise be detected, exposed, penalized and
discouraged through weighting.
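
As a rough sketch of that kind of joint profiling (the profiles, norms
and thresholds below are all invented), the "overproduction" signature
is simply a high article count co-occurring with low per-article
citations and downloads relative to the field:

    import numpy as np

    # Hypothetical per-researcher profiles: article count, citations per
    # article, average journal impact factor, downloads per article.
    names = ["A", "B", "C", "D", "E"]
    profiles = np.array([
        [12, 18.0, 3.2, 140.0],
        [10, 22.0, 4.1, 160.0],
        [45,  2.5, 1.1,  20.0],   # many papers, little uptake per paper
        [ 8, 15.0, 2.8, 120.0],
        [14, 19.5, 3.5, 150.0],
    ])

    # z-scores of each metric relative to the (field-specific) norms.
    z = (profiles - profiles.mean(axis=0)) / profiles.std(axis=0)

    # "Overproduction" signature: output well above the norm, per-paper
    # citations and downloads well below it.  Thresholds are arbitrary here.
    for name, (z_articles, z_cites, z_impact, z_downloads) in zip(names, z):
        if z_articles > 1.0 and z_cites < -1.0 and z_downloads < -1.0:
            print(f"Researcher {name}: profile worth a closer (open) look")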

> By contrast to over-production, do you think that an assessment system
> that looks at total output would run the risk of suppressing outputs
> that might be predicted to be cited less frequently?

Not unless it is decided (for some unknown a priori reason!) that a
profile consisting of N highly cited papers plus M less cited papers is
to be given a lower weight than a profile consisting of N highly cited
papers plus 0 less cited papers!

> UK research assessment currently looks at four outputs per researcher,
> usually selected by the individual as their best research.

That, of course, was a foolish, arbitrary constraint all along: It
was (well-meaningly) intended to minimise both salami-slicing and the
number of papers the panel would have to read. But of course continuous
OA metrics solve both problems, as they can detect and weight the
salami-slicing profile, and panel-reading (after the validation phase)
is no longer a factor, except as a periodic higher-level check on
the continuous, dynamic weightings and profiles. (So let all papers
be considered, continuously, and let 1000 metrics bloom, under open
peer scrutiny, and panel monitoring and weight calibration!)

> The proposal is that post-2008 the metrics assessment would be of all
> output, creating a profile and then deriving a metric derived from that.

"A" metric? Or a battery of metrics? (The "h-index" and its ilk are all
examples of a-priori, unvalidated, fixed, 1-number metrics; what is
needed is a rich multiple regression equation, with adjustable weights,
validated initially against the 2001 and 2008 RAE panel rankings. You
can add prewired metrics like the h-index to the battery, but don't use
them *instead* of a weighted, multimetric battery.)
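
For the record, a prewired index like the h-index is trivial to
compute, and nothing prevents it from being fed into the battery as
just one more weighted component rather than standing alone. A minimal
sketch, with invented citation counts:

    def h_index(citation_counts):
        """Largest h such that at least h papers have at least h citations each."""
        counts = sorted(citation_counts, reverse=True)
        h = 0
        for rank, c in enumerate(counts, start=1):
            if c >= rank:
                h = rank
            else:
                break
        return h

    # Invented per-paper citation counts for one researcher.
    papers = [45, 30, 22, 15, 9, 7, 4, 2, 1, 0]
    print(h_index(papers))  # -> 6: six papers with at least 6 citations each

    # In a multimetric battery this value is just one more (weighted)
    # column alongside article counts, citations, downloads, etc.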

> Is there a risk that researchers, realizing that outputs aimed at
> practitioners often appear in relatively lower impact journals, would
> then tend to reduce the number of papers they produced aimed at
> transferring knowledge from the research base and concentrate on outputs
> targeted at high-impact journals in the research-base core? They
> would expect by doing so to avoid dilution of their citation average.

This would be faulty reasoning on the part of researchers, if there were
a continuous, multi-metric equation in place, with its weights being
dynamically updated under peer scrutiny to detect and weight exactly
this sort of practice!

If applications are valued in a field, add application metrics: Are
certain journals more applications-oriented? Crank up their weight! Is
it better to partition citations into basic vs. applied journals, with
differential weights for citations in the one and the other in certain
fields? Do so. Don't just think of a univariate measure (citations,
or h-index) and how authors might bias that measure by altering the kind
of journals they publish in, or the number of articles they submit for
assessment! Think multivariately, dynamically, and openly.
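
To make the basic/applied weighting concrete (the journal
classification and the weights below are invented, and would
themselves be set and tuned openly, field by field):

    # Citations to one researcher's work, tagged by the type of the
    # citing journal.  An applications-valuing field can crank up the
    # "applied" and "practitioner" weights; another field can turn them down.
    citations_by_type = {"basic": 120, "applied": 45, "practitioner": 30}
    field_weights     = {"basic": 1.0, "applied": 1.4, "practitioner": 1.6}

    weighted_citations = sum(
        count * field_weights[jtype]
        for jtype, count in citations_by_type.items()
    )
    print(weighted_citations)  # 120*1.0 + 45*1.4 + 30*1.6 = 231.0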

New applications metrics, besides journal types, might include
downloads, or even (if possible) industrial IP downloads; patents are
also metrics. Depending on the field, there will no doubt be other
measurable, monitorable performance indicators for applications impact
(and for teaching impact too!).

It's not all about ways to bias one single citation metric, but about
developing richer metrics. If the worry is about encouraging technology
transfer and applications flow, find objective measures of it and plug them
into the equation. Don't treat it as just a default bias, to be minimized
by cutting down on metrics!

> The net effect could be to reduce the UK's volume of less frequently
> cited papers, but also to reduce information flow to the people who
> turn research into practice.

This is again univariate thinking. Yes, citation counts are important,
but there are citations and citations. Basic citations, applied
citations. Basic publications, applied publications. Not only do fields
have to compare like with like, but their preferred blends can be
weighted and rewarded accordingly.

(This, by the way, is not "biasing", any more than mandating and rewarding
publication itself is biasing: it is providing incentives for the kind
of research performance we want, and that we want to reward. Continuous
multivariate OA metrics allow preferred profiles to be rewarded and
encouraged dynamically. Cheater detection allows self-citations, robotic
or anomalous download inflation, salami-slicing, etc. to be detected,
exposed and penalized. Metrics are not ends in themselves, they are
merely objective performance correlates. They are easy to abuse singly,
but much harder to abuse jointly, and in the open.)
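
Two of those checks are simple enough to sketch (the author lists,
thresholds and download log below are all invented; real detection
would be calibrated and scrutinised openly): discounting self- and
co-author citations, and flagging download tallies dominated by a
single requesting host:

    from collections import Counter

    # Discount citations whose citing-author set overlaps the cited-author set.
    cited_authors = {"harnad", "brody", "carr"}
    citing_papers = [
        {"smith", "jones"},
        {"harnad", "carr"},      # self-citation
        {"lee"},
        {"brody", "nguyen"},     # co-author self-citation
    ]
    independent = [a for a in citing_papers if not (a & cited_authors)]
    self_rate = 1 - len(independent) / len(citing_papers)
    print(f"self-citation rate: {self_rate:.0%}")   # 50%

    # Flag download tallies dominated by a single requesting host.
    download_log = ["10.0.0.7"] * 180 + ["192.168.1.3", "172.16.0.9"] * 5
    by_host = Counter(download_log)
    host, hits = by_host.most_common(1)[0]
    if hits / len(download_log) > 0.5:
        print(f"suspicious: {hits}/{len(download_log)} downloads from {host}")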

The UK's RAE is unique; so is its new conversion to metrics. The UK is
hence leading the world in research metrics. Don't think cravenly in
terms of how the UK will stack up in terms of existing, unvalidated,
univariate metrics. Think in terms of establishing metric standards for
the entire world research community in the metric OA era!

    Harnad, S., Carr, L., Brody, T. & Oppenheim, C. (2003)
    Mandated online RAE CVs Linked to University Eprint
    Archives: Improving the UK Research Assessment Exercise
    whilst making it cheaper and easier. Ariadne 35.
    http://www.ecs.soton.ac.uk/~harnad/Temp/Ariadne-RAE.htm

    Shadbolt, N., Brody, T., Carr, L. and Harnad, S. (2006) The Open
    Research Web: A Preview of the Optimal and the Inevitable, in Jacobs,
    N., Eds. Open Access: Key Strategic, Technical and Economic Aspects,
    chapter 21. Chandos. http://eprints.ecs.soton.ac.uk/12453/

    Harnad, S. (2007) Open Access Scientometrics and the UK Research
    Assessment Exercise. In Proceedings of 11th Annual Meeting of the
    International Society for Scientometrics and Informetrics 11(1), pp.
    27-33, Madrid, Spain. Torres-Salinas, D. and Moed, H. F., Eds.
    http://eprints.ecs.soton.ac.uk/13804/

    Brody, T., Carr, L., Gingras, Y., Hajjem, C., Harnad, S. and
    Swan, A. (2007) Incentivizing the Open Access Research Web:
    Publication-Archiving, Data-Archiving and Scientometrics. CTWatch
    Quarterly 3(3). http://eprints.ecs.soton.ac.uk/14418/

Stevan Harnad

> Jonathan Adams
>
> Director, Evidence Ltd
> + 44 113 384 5680
>
> Comment on: "Bibliometrics could distort research assessment"
> Guardian Education, Friday 9 November 2007
> http://education.guardian.co.uk/RAE/story/0,,2207678,00.html
>
> Yes, any system (including democracy, health care, welfare, taxation,
> market economics, justice, education and the Internet) can be abused.
> But
> abuses can be detected, exposed and punished, and this is especially
> true in the case of scholarly/scientific research, where "peer review"
> does not stop with publication, but continues for as long as research
> findings are read and used. And it's truer still if it is all online and
> openly accessible.
>
> The researcher who thinks his research impact can be spuriously enhanced
> by producing many small, "salami-sliced" publications instead of fewer
> substantial ones will stand out against peers who publish fewer, more
> substantial papers. Paper lengths and numbers are metrics too, hence
> they too can be part of the metric equation. And if most or all peers do
> salami-slicing, then it becomes a scale factor that can be factored out
> (and the metric equation and its payoffs can be adjusted to discourage
> it).
>
> Citations inflated by self-citations or co-author group citations can
> also be detected and weighted accordingly. Robotically inflated download
> metrics are also detectable, nameable and shameable. Plagiarism is
> detectable too, when all full-text content is accessible online.
>
> The important thing is to get all these publications as well as their
> metrics out in the open for scrutiny by making them Open Access. Then
> peer and public scrutiny -- plus the analytic power of the algorithms
> and the Internet -- can collaborate to keep them honest.
>
Received on Fri Nov 16 2007 - 15:00:19 GMT
