ON THE DEMYSTIFICATION OF MENTAL IMAGERY by Kosslyn
It has been argued that imagery cannot be the sole form of internal
representation, as an image cannot represent an object or scene without
some interpretative function that picks out the salient features. The
`stage directions' indicating which are the important features cannot
also be images, or we will get into the problem of infinite regress.
Kosslyn thinks, however, that there is no reason why imagery cannot be
one form of representation in memory.
Kosslyn describes several experiments which undertake to determine
whether images act as functional representations which have real-life
spatial characteristics, or whether they are an epiphenomenon.
Subjects were timed on how long it took them to scan a mental map.
The results suggested that images do represent metric distance and
that this property affects real-time processing of images. This
implies that images also have spatial boundaries; this was tested by
seeing whether subjects could image objects to the point of overflow.
Subjects' reports indicated a high correlation between the size of the
imagined object and its apparent distance at overflow. It was also found that it
took longer for subjects to see properties on subjectively smaller
images. These results support the claim that our experienced images are
spatial entities and their spatial properties have real consequences
for some forms of information processing.
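The scan-time result is usually summarised as a linear relation between
reaction time and metric distance on the memorised map. A minimal sketch of
that relation, with invented constants (the slope and intercept below are
illustrative, not Kosslyn's figures):

```python
def predicted_scan_time(distance_cm, intercept_ms=300.0, slope_ms_per_cm=40.0):
    """Predicted time (ms) to scan between two landmarks on a mental map,
    assuming scan time grows linearly with imagined metric distance."""
    return intercept_ms + slope_ms_per_cm * distance_cm

# Landmark pairs that are farther apart on the memorised map should take
# longer to scan between.
times = [predicted_scan_time(d) for d in (2.0, 6.0, 12.0)]
assert times[0] < times[1] < times[2]
```

The point of the sketch is only that distance, a spatial property of the
image, enters directly into the timing of processing.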
Kosslyn takes the results as supporting his CRT protomodel which
predicts that images are processed by the same sorts of classificatory
procedures that underlie normal perceptual processing. The CRT model
rests on the notion that visual images might be like displays produced
on a cathode ray tube by a computer programme operating on stored data.
Images are thus seen as temporary spatial displays in active memory
that are generated from more abstract representations in long term
memory, and are then interpreted and classified.
Kosslyn then goes on to examine the question of whether images are
retrieved in toto from memory, or whether they are constructed from
parts. It was found that larger and more detailed images took more time
to construct, which favours the second view. I find these experiments a
bit dubious, as they seem to view image construction simply as a visual
process, and ignore the possible underlying effects of internal talking.
For instance, if I am asked to imagine a cow, and it is then measured how
long it takes me to imagine its component bits, e.g. udder, eyes, tail,
etc., I might choose to say my image is complete once I have repeated
the instructions to myself sotto voce, rather than when a mental
picture has formed. Kosslyn et al do admit later that descriptive
factors also play a role in imaging, but do not seem to take this into
account when interpreting their earlier experiments.
Kosslyn then goes on to construct a computer simulation model that
reflects the properties of images suggested by their experimental
evidence. The simulation contains a `surface matrix'
representing the image itself, and long term memory files which
represent the information used in generating images. The surface matrix
simulates 5 properties of imagery:
1. the image depicts information about spatial extent, as well as
brightness;
2. the degree of activation decreases with distance from the centre, as
the `overflow experiment' suggests that images fade at the periphery;
3. the surface has limited resolution - based on the finding that smaller
images are more difficult to inspect;
4. the spatial medium within which images occur is of limited extent,
and round or elliptical in shape;
5. the matrix corresponds to visual short-term memory, and is subject to
fading - this arises from findings that complex images are more
difficult to maintain.
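Properties 2-4 can be illustrated with a toy surface matrix. This is my own
sketch, not Kosslyn's code; the grid size and fall-off constant are invented:

```python
import math

SIZE = 11          # an arbitrary grid size; the medium is of limited extent
CENTRE = SIZE // 2

def activation(row, col, falloff=0.15):
    """Activation a cell of the surface matrix can sustain: highest at the
    centre of the medium and fading towards the periphery (overflow)."""
    dist = math.hypot(row - CENTRE, col - CENTRE)
    return max(0.0, 1.0 - falloff * dist)

# An imaged part placed near the centre is better supported than one
# pushed out to the edge of the medium.
assert activation(CENTRE, CENTRE) > activation(0, 0)
```

A round high-activation region emerges simply from the distance-based
fall-off, which is one way of reading property 4.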
Image generation uses three procedures - PICTURE, PUT and FIND - which
perform the computations that generate the images. PICTURE takes
specifications of size, location and orientation; PUT integrates parts
into an image; FIND locates relevant parts of the image. Image
classification mainly uses FIND, but may need to call on other
procedures such as LOOKFOR (SCAN, ZOOM, PAN, ROTATE). Image
transformation uses the procedures above.
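The division of labour among the generation procedures can be sketched in
code. The function bodies below are my own guesses at minimal behaviour;
only the names PICTURE, PUT and FIND come from Kosslyn's model:

```python
def PICTURE(part, size, location, orientation):
    """Render one stored part at the given size, location and orientation
    (here just a record of the specification)."""
    return {"part": part, "size": size, "at": location, "angle": orientation}

def FIND(image, description):
    """Locate parts of the image matching a description, e.g. a foundation
    part to attach the next part to."""
    return [p for p in image if p["part"] == description]

def PUT(image, part, size, location, orientation):
    """Integrate a newly rendered part into the growing image."""
    image.append(PICTURE(part, size, location, orientation))
    return image

# Constructing an image part by part, as the construction-time experiments
# suggest: more parts mean more PUT calls, hence more time.
cow = []
PUT(cow, "body", size=1.0, location=(0, 0), orientation=0)
PUT(cow, "udder", size=0.3, location=(0, -1), orientation=0)
assert len(FIND(cow, "udder")) == 1
```

On this reading, the finding that detailed images take longer to construct
falls out of the number of PUT/FIND cycles required.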
Kosslyn et al feel that the use of a computer simulation model enables
them to counter the objection that notions of mental imagery are vague
and logically incoherent. They list the advantages of using a computer
simulation model as follows:
1. it forces them to be explicit in their assumptions;
2. features of computational models correspond closely to many features
of cognitive models;
3. it shows whether the ideas are sufficient in principle to account
for the data;
4. it enables predictions to be made on the basis of complex
interactions among components.
Whilst I do not argue against any of these advantages, I personally
feel that constructing models based on data and ideas arising from
computer simulations would tend to lead you into mistaken `imagery'
about how the brain works. This is because of the fundamental
differences between the two processing techniques, i.e. a computer
processes in a serial fashion while human brains operate by parallel
processing.
Kosslyn goes on to counter some of the objections about the validity of
his results. These are well-founded questions, because the whole basis
of his theory is very shaky unless he can convince us that the subjects'
reports accurately mirror what is going on in their brains.
One objection is that subjects try to give `right' answers by
fulfilling the experimenters' expectations in their reports. Kosslyn
counters this rather weakly by pointing out instances of results that
ran counter to their own expectations. However the subjects themselves
may report based on their own expectations. For example, they know it
takes longer to scan a wide image in real life and so assume it would
be `correct' to take longer in image scanning. There are many
objections centred around this general theme, and Kosslyn reports that
they try to prevent these effects by questioning the subjects about
what they thought the purpose of the experiments was, and whether they
used particular strategies. It still seems an open question whether,
or by how much, the image scanning effects are contaminated by
the demand characteristics of the experiment.
Kosslyn also counters Anderson's argument that many models could
account for the same data, eg one would not be able to distinguish a
propositional or an image based theory from the data.
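One kind of prediction that could distinguish the theories: if rotation
sweeps over cells of the spatial medium, a larger image covers more cells
and should take longer to rotate. A toy sketch of that idea (my own
illustration, not Kosslyn's simulation; the per-cell cost is invented):

```python
COST_PER_CELL_MS = 0.5   # invented per-cell processing cost

def filled_cells(radius):
    """Number of matrix cells covered by a roughly circular image."""
    return sum(1 for x in range(-radius, radius + 1)
                 for y in range(-radius, radius + 1)
                 if x * x + y * y <= radius * radius)

def rotation_time_ms(radius):
    """If each covered cell must be moved, rotation cost scales with area."""
    return COST_PER_CELL_MS * filled_cells(radius)

# A subjectively larger image should take longer to rotate.
assert rotation_time_ms(8) > rotation_time_ms(3)
```

A purely propositional encoding has no obvious analogue of "covering more
cells", which is why size effects are held to favour the spatial account.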
Kosslyn suggests that a large mental image should take longer to rotate
as it will cover more `cells' than a small image, a prediction that
would not arise from a propositional theory. Kosslyn also takes on
Pylyshyn's challenge to determine which aspects of the imagery system
are `computationally primitive' i.e. cannot be affected by
expectations, intentions etc. They believe that the visual buffer is
primitive, in that it could not allow cognitive penetration. They also believe,
due to parsimony considerations, that processes that detect patterns in
images (which they call the mind's eye) will be the same as those that
detect patterns in the visual system. They do allow that image
transformation and image classification are not likely to be
computationally primitive.
Kosslyn et al are aware of the difficulties in their approach, and do
not claim to be attempting any more than a `protomodel'. They also
admit that the model is continuously being revised to fit the data, and
defend this by claiming that this is the essence of the model
constructing process. I am personally not too happy about this stance,
because, as I have already pointed out, the model is based on a
fundamentally different processing style to that of the human brain. I
also remain skeptical as to whether, or to what extent, the millisecond
differences between subjects' response times on various tasks can be
seen as reflecting genuine differences in procedures, rather than
individual differences, expectancies, etc.
This archive was generated by hypermail 2b30 : Tue Feb 13 2001 - 16:23:56 GMT