Re: Rosch: Categorisation

From: Harnad, Stevan (harnad@cogsci.soton.ac.uk)
Date: Tue Dec 05 1995 - 17:47:13 GMT


> From: "Baden, Denise" <DB193@psy.soton.ac.uk>
> Date: Thu, 30 Nov 1995 17:00:56 GMT
>
> Rosch begins by making the point that human categorization should
> not be considered as arbitrary, e.g. as the product of historical
> accident or whimsy. A lovely quote by Borges is given to demonstrate
> the sort of system that doesn't occur in the classification of
> animals: `a) those that belong to the emperor, b) embalmed ones, c)
> those that are trained.....'

What follows will be largely a critique of Rosch's ideas (not of your
kid-brother precis of them, which is very good!). It's also not a
criticism of Borges, whose insights into categorisation are keen.

The suggestion that human categorisation should not be "considered"
arbitrary is somewhat problematic. The categories that there are in the
world for us to pick out and name and use in some way are infinite
(literally), and, yes, they include "natural" categories as well as
"arbitrary" ones. It's useful to pick out and name foods and people, but
if you are living in an authoritarian regime, it may also be useful to
pick out and name the kinds of things the dictator likes or commands,
and that could be arbitrary (prayers on Tuesdays and Thursdays, ribaldry
on Wednesdays and Fridays, and, yes, "animals that belong to the
emperor").

"Embalmed animals" ARE a category, but instead of having a proper name
of its own, the category is named by a description. It is arbitrary when
the demands of discourse make it more efficient to take the shortcut of
giving a category a proper name instead of resorting to the longhand
description.

And, yes, there ARE categories that are just the product of historical
accident ("flying buttresses" in architecture, place names, etc.) or
whimsy (think of some).

The point is this: Rosch completely mixes up what (for want of a more
kid-brotherly word) we can call the metaphysical and the psychological
sides of categories. Metaphysically, there are "things" in the world,
concrete things, which physics, chemistry and biology study, and
abstract things, which mathematics and philosophy study. There are also
natural and artificial things (which engineers make and study).

We are psychologists. Neither the physics nor the metaphysics of things
is our concern. Our concern is with people and animals, and HOW (and
only secondarily WHAT) THEY categorise. I say "what" secondarily,
because in order to understand how people categorise apples, we need a
rough idea of what apples are, or at least what they look like and
are thought to be by people. But beyond that, we are not concerned with
the physics, botany, or metaphysics of apples. We are not the ones
investigating what apples REALLY are. In that sense we are not
interested in what the members of the category "apple" really are,
forever and ever. Our task is merely to explain how people manage to
sort and label (= categorise) what they can and do sort and label as
they do.

And, yes, people DO sometimes sort and label things on the basis of
historical accident or whimsy, and not because they are following the
dictates of physical or metaphysical law.

There is at least one sense in which a category cannot be arbitrary,
however, or perhaps I should rather say that such arbitrary "categories"
do not have the same interest or implications that the regular ones do.
Here's an example of a regular "arbitrary" category: People that Saddam
Hussein dislikes on Tuesdays. That arbitrary category is a real one for a
psychologist, because under a tyrannical regime, people's lives might
depend on being able to sort out who does and does not belong to the
category of those whom Saddam dislikes on Tuesdays (for if you are
friendly with such a person on Tuesday, you may die).

Let's call such people that it is risky to associate with on Tuesdays in
Iraq "tuesads." Tuesads are an arbitrary category, but they are
otherwise exactly like poison mushrooms, the natural category you had
better be able to sort correctly or you die. The criterion, in both
cases, tuesads and toadstools (poison mushrooms), is that there are
CONSEQUENCES of MIScategorisation (sorting and labeling them wrongly):
Talk to a tuesad in Iraq on Tuesday, or eat a toadstool, and you die.
So in THAT sense, neither category is arbitrary: Categorise wrongly and
you'll see.

In fact, in general, where there is a right and wrong of the matter (in
terms of the consequences of sorting and labeling in one way rather than
another), the categorisation is NOT arbitrary (even if the basis for the
consequences is arbitrary, i.e., Saddam's whim).

But what about the category "tuesad" FOR SADDAM? For us, ordinary
mortals, whoever Saddam takes a dislike to is poison. But for Saddam,
there are no consequences, there is no right or wrong of the matter: He
could toss a coin to decide whether or not he likes someone, and it
would be the same.

This last kind of arbitrariness is the kind I meant when I said that
there was one sense in which a category couldn't be arbitrary (or that
an arbitrary category is not a category). Again, we have to remember
that we are psychologists, concerned with explaining how people sort
and label what they do, as they do. To determine how they sort and
label, we have to see what they put into the category and what they
don't put into it. And we have to have a way of knowing when they are
right and when they are wrong. But Saddam is always right! No matter
what he puts in the category "tuesads," it's a tuesad. There are no
consequences of getting it wrong; there is no wrong.

"Categories" like that, that depend purely on the whim of ONE person,
and have no consequences, either for him or for anyone else, are
probably better described as subjective tastes rather than categories,
because they really are dissociated from the objects (tuesads and
non-tuesads) that they seem to be sorting and labeling. (This is
what Wittgenstein, for those of you who may know about him, SHOULD have
been arguing in his argument that there can be no private language -- a
language in which you simply decide for yourself what you are going to
call what, without any guidance from the consequences of getting it
wrong.)

I'll get back to this, but just wanted to fix these distinctions in
your mind: (1) the difference between the metaphysical problem of categories
(what ARE things, really?), which is not the psychologist's problem, and
the psychological problem of categories (what do people sort and label,
and how?) and (2) the one kind of arbitrary category that is not really a
category: the kind for which there are no consequences of getting it
right or wrong.

> Two general principles are proposed for the formation of categories.
> First, cognitive economy - `to reduce the infinite differences among
> stimuli to behaviourally and cognitively usable proportions'.
> Second, it reflects the perceived world structure, as the objects in
> the world possess high correlational structure e.g. it is an
> empirical fact that wings co-occur more often with feathers than
> with fur.

You may by now already see why these "principles" are not satisfactory,
and miss the point (because they mix up the metaphysical and the
psychological problem of categories).

"Reducing stimulus differences to useable proportions" is a good start,
and sounds like it's addressing the pschological problem of
categorisation, but then what has the "high correlation" in the world
between wings and feathers got to do with it? The world is FULL of
correlations, just as it's full of features. Most of these are useless,
because they are irrelevant to anything that has consequences for us.
The only correlations that MATTER are the correlations with the
consequences of miscategorisation, of sorting and labeling wrongly:
They're the ones we have to learn (or be born able) to avoid.

Moreover, these "principles" don't give you any idea of why "cognitive
economy" would be useful! Here's a principle (which you should already
have gleaned from Luria, Funes, the Ugly Duckling Theorem, and Miller's
Magical number): It is not possible to pick out KINDS of things
(categories) without selecting some of their properties and ignoring
others. Otherwise everything is infinitely unique, and infinitely
useless. THAT's why you have to "reduce the infinite differences among
stimuli." And, yes, it's differences among STIMULI that we, as
psychologists, are talking about, not differences among the OBJECTS out
there, of which the stimuli are the "shadows" (although of course there
have to be correlations between the two if we are to make our way
successfully through the world).

So there aren't two category formation principles; there's one:
Categorisers must be able to detect the correlations between what they
see and what they do that allow them to perform the right action (output)
on the right KIND of object (stimulus, input). The world, not the
categoriser, dictates what's "right." The categoriser (if it's to
succeed and survive) must figure out how to do it right. If we simplify
the actions and just call them "labels" or names, the categoriser must
learn what kind of thing to call by what name.
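
To make that concrete, here is a toy sketch (in Python; the mushroom
features and labels are invented, and the error-correcting learner is
mine, not a claim about how people do it) of a categoriser whose
feature weights are shaped entirely by feedback from the consequences
of miscategorising:

    # A minimal sketch: a categoriser that learns which features matter
    # from the consequences of sorting wrongly (supervised error
    # correction). All features and labels below are invented.

    samples = [
        ((1.0, 0.2, 0.9), 1),   # edible mushroom
        ((0.9, 0.1, 0.8), 1),   # edible mushroom
        ((0.1, 0.9, 0.2), 0),   # toadstool
        ((0.2, 0.8, 0.1), 0),   # toadstool
    ]

    weights = [0.0, 0.0, 0.0]
    bias = 0.0

    for _ in range(20):                  # a few passes over experience
        for features, label in samples:
            total = sum(w * f for w, f in zip(weights, features)) + bias
            guess = 1 if total > 0 else 0
            error = label - guess        # the "consequence" signal
            weights = [w + error * f for w, f in zip(weights, features)]
            bias += error

    print(weights)  # the toadstool-correlated feature ends up weighted
                    # negative: it predicts "don't eat"

Nothing in the sketch cares which correlations exist in the world at
large; the only ones it picks up are those that feedback from sorting
wrongly forces on it.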

> The distinction between vertical and horizontal levels of
> categorization is made. The vertical dimension concerns the level of
> inclusiveness of the category - the dimension along which the terms
> collie, dog, mammal, animal vary. The horizontal dimension concerns
> the segmentation of categories at the same level of inclusiveness -
> the dimension along which the terms car, dog, chair etc vary. Rosch
> argues that the use of prototypes, which contain the most
> representative attributes inside the category, would increase the
> flexibility and distinctiveness of categories along the horizontal
> dimension.

It is true that our categories form a hierarchy of abstraction, but
nothing much follows from this fact, because how we need to sort and
label, and re-sort and re-label, depends, as always, on the consequences
of getting it right or wrong. There are contexts where you have to say
"watch out for the dog!" because "watch out for the mammal" wouldn't
help (if you were surrounded by mammals, including people, and only the
dog was preparing to bite you). [Remember the example I gave you of
information and reducing uncertainty with the 6-alternative sandwich
machine.]
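
To recall the arithmetic of that example: the uncertainty among N
equiprobable alternatives is log2(N) bits, so naming one of 6 sandwiches
resolves about 2.6 bits. A trivial check (in Python):

    import math

    # Uncertainty among N equiprobable alternatives, in bits.
    def bits(n_alternatives):
        return math.log2(n_alternatives)

    print(bits(6))  # ~2.585 bits: the 6-alternative sandwich machine
    print(bits(2))  # 1 bit: "dog" vs. "not-dog" in a two-way context

A category name is informative only relative to the alternatives it
rules out.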

The entry-point into our hierarchy of categories is arbitrary (besides,
it's really a NETWORK of labels, rather than a strict hierarchy: Is
"blue" higher or lower than "mammal"? To the left or to the right of it?).

And there is no basic level! There's just a default context: You can
call Fido any name, from a Funes-like "Fido-at-3:15pm-frontal-view"
to Fido, to terrier to dog to carnivore to mammal to animal...
Which one you choose depends on the alternatives among which you need to
reduce uncertainty. "Dog" is just a default context; rather like saying
hello on the phone before you know it's someone to whom you would
say hi, or someone on whom you would hang up...

> Rosch did a series of experiments which showed the way in which these
> principles appeared to result in a basic level of categorization, as
> opposed to a superordinate or subordinate level (eg
> superordinate=furniture; basic=chair; subordinate=kitchen chair).

These levels are ARBITRARY. Every category name has higher and lower
levels of abstraction as alternatives. The default context of
alternatives is just determined by frequency and consequences: It is
different for a furniture salesman or a department store information
clerk.

> Basic level categories had more cue validity and category
> resemblance than other levels. Cue validity is a probabilistic
> concept; the validity of a given cue x as a predictor of a given
> category y. A category with high cue validity is, by definition,
> more differentiated from other categories than one with low cue
> validity. Category resemblance is defined as the weighted sum of the
> measures of all the common features within a category minus the sum
> of the measures of all the distinctive features. Thus superordinate
> categories have lower cue validity and category resemblance than
> basic level categories, because they have fewer common attributes.
> Subordinate categories have lower total cue validity than basic
> categories, because they share more attributes with other
> subordinate categories (eg kitchen chair & dining room chair)

It is again arbitrary which "cues" one singles out here. We know people
usually don't know what features they use to categorise. To find out
what features they use, one must actually investigate categorisation --
both its learning and its performance. For any categorisation that we
can do correctly and reliably, there will always have to be features
with high "cue validity." The rest is just a matter of what the
alternatives are, amongst which we need to reduce uncertainty (to avoid
the consequences of miscategorisation).
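
The arithmetic of cue validity itself is trivial once you have decided
which cues to count. Here is a toy estimate of P(category | cue) from
tallies (in Python; the observations are invented):

    # Cue validity of cue x for category y, estimated as P(y | x)
    # from a tally of (cue, category) observations. Toy data, invented.

    observations = [
        ("wings", "bird"), ("wings", "bird"), ("wings", "insect"),
        ("feathers", "bird"), ("fur", "mammal"), ("fur", "mammal"),
    ]

    def cue_validity(cue, category, data):
        with_cue = [c for (x, c) in data if x == cue]
        if not with_cue:
            return 0.0
        return with_cue.count(category) / len(with_cue)

    print(cue_validity("wings", "bird", observations))     # 2/3
    print(cue_validity("feathers", "bird", observations))  # 1.0

The hard part, which the formula does not touch, is deciding which
cues to tally in the first place.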

None of this cue calculus has yet resulted in a device that can actually
do nontrivial categorisation. Besides, as we all know by now (from the
Ugly Duckling Theorem), without a prior weighting of the cues, the
number of shared and unshared features is indeterminate. The name of
the game is finding which features are invariant and predictive and
which can be ignored.
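
The Ugly Duckling point can even be checked by brute force: if every
Boolean combination of features counts as a predicate, any two distinct
objects share exactly as many predicates as any other two. A small
demonstration in Python (3 binary features, hence 8 objects and 2^8 =
256 predicates):

    from itertools import product

    # Objects: all descriptions over 3 binary features (8 objects).
    objects = list(product([0, 1], repeat=3))

    # Predicates: every Boolean function over those 8 objects (256 of
    # them), each represented by its truth value on each object.
    predicates = list(product([0, 1], repeat=len(objects)))

    def shared(a, b):
        """Count the predicates true of both object a and object b."""
        ia, ib = objects.index(a), objects.index(b)
        return sum(1 for p in predicates if p[ia] and p[ib])

    counts = {shared(a, b) for a in objects for b in objects if a < b}
    print(counts)  # {64}: every distinct pair shares exactly 64 predicates

Until some features are weighted and others ignored, everything is
exactly as similar to everything else as to anything else.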

> Rosch
> investigated 4 converging operational definitions of the basic level
> of abstraction: attributes in common, motor movements in common,
> objective similarity in shape, and identifiability of averaged
> shapes. She found that objects were first perceived at their basic
> level; that basic level words were the first to be acquired in
> infancy and that the basic level is linguistically the most useful.

Apart from the special case of inborn feature detectors, these
universals are just reflections of uniformities of experience or
what I've called default contexts of alternatives. Change the
alternatives and the consequences for the children and I bet you'll get
different "basic" categories.

> Prototypicality was found to be governed by the same principles such
> as maximization of cue validity and category resemblance as those
> principles governing the formation of the categories themselves.
> Rosch and Mervis (1975) thus found that the more prototypical of a
> category a member is rated, the more attributes it has in common
> with other members of a category and the fewer attributes in common
> with members of contrasting categories. This may be explained in 2
> ways. 1. that such structure is given by the correlated clusters of
> attributes in the real world. 2. the structure may be the result of
> the human tendency, once a contrast exists, to define attributes for
> contrasting categories so that the categories will be maximally
> distinctive.

Typicality judgment and categorisation are not the same task; typicality
already presupposes categorisation. It would have been interesting if
findings about typicality HAD given us a clue about categorisation, but
a closeness-to-prototype model works fine for continuous categories,
like "big" (everything is big to some degree), but not for the kind of
categories Rosch actually works with, such as "bird": on a prototype
model a fish would merely be less "birdy" than a robin, whereas in fact
a fish is NOT an atypical bird; it is not a bird at all!
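
To see why, consider a minimal nearest-prototype classifier (a sketch
of the closeness-to-prototype idea with invented feature vectors, not
Rosch's own model):

    # A minimal nearest-prototype classifier: membership = closeness
    # to a category's prototype vector. Features and values invented.

    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    prototypes = {
        "bird": (1.0, 1.0, 0.9),   # wings, feathers, flies
        "fish": (0.0, 0.0, 0.0),   # none of those
    }

    def classify(item):
        return min(prototypes, key=lambda c: distance(item, prototypes[c]))

    penguin = (1.0, 1.0, 0.0)      # far from the "bird" prototype
    print(classify(penguin))        # "bird", but only by a graded margin

The distance is graded all the way down: on this model a penguin is only
mostly a bird, and a fish would just be an extremely atypical one. But
real membership is all-or-none: a penguin is 100% a bird, and a fish is
0% one.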

What HAS turned out to be successful in getting devices to categorise is
feature-learning -- but that's precisely the issue on which Rosch is
silent (or irrelevant). Of course, if in the end one merely redefines
a feature-detector as a "prototype" then everyone wins (but no one goes
home richer...).

> Rosch also presented evidence that prototypes of categories are
> related to the major dependent variables with which psychological
> processes are usually measured. These include things like speed of
> processing, where subjects decide whether x is a member of category
> y, speed of learning of artificial categories, order and probability
> of item output, i.e. when subjects list examples of a category.
> However prototypes do not specify representation and process models.
> For example, in pattern recognition, prototypes can be described as
> well by feature lists, structural descriptions or templates. Also
> prototypes can be represented by both propositional and image
> systems.

In other words, you can call anything a "prototype" (and think of it in
terms of your favorite mental image). The whole Roschian tradition began
as an attack on the "classical" view, according to which categorisation
was accomplished on the basis of a set of features that were necessary
and sufficient for category membership. We were told there was something
wrong with this view because:

(1) Subjects cannot give you a list of features on the basis of
which they categorise (and will often give you incomplete or incorrect
lists). [Wittgenstein was cited as an authority on this: No one can
define "game." Fine, but we can still say what is and isn't a game.
How do we do THAT if not by features?]

(2) Subjects find some category members more typical than others

(3) Subjects categorise and learn the typical members faster

(4) Wittgenstein said categories are not based on features but "family
resemblances"

So the classical view is thrown out and replaced by a prototype view
with membership based on closeness to a prototype. This works for
continuous "categories" like big, but not for categorical categories,
like "bird." So the prototype view begins to alter, talking again about
features, but not about "classical" features. It ends up with something
looking more and more like a classical featural view, but persistently
called a prototype...


