Harnad, S. (1993) Grounding Symbols in the Analog World with Neural Nets. Think 2: 12 - 78 (Special Issue on "Connectionism versus Symbolism" D.M.W. Powers & P.A. Flach, eds.). Pp. 44-47.
In this paper I offer an explanation of how the grounding of stimuli in an initial analog world can affect the interpretability of symbolic representations of the behaviour of neural networks performing cognition. I make two assertions about the form of networks powerful enough to perform cognition: first, that they be composed of non-linear elements and, second, that their architecture be recurrent. As nets of this type are equivalent to non-linear dynamical systems, I then go on to consider how the behaviour of such systems can be represented symbolically. The crucial feature of such representations is that they must be non-deterministic; they therefore differ from deterministic symbol systems such as Searle's Chinese Room. A whole range of non-deterministic symbol systems representing a single underlying continuous process can be produced at different levels of detail. Symbols in these representations are not indivisible: if the contents of a symbol in one level of representation are known, then the subsequent behaviour of that symbol system may be interpreted in terms of a more detailed representation in which non-determinism acts at a finer scale. Knowing the contents of symbols therefore affects our ability to interpret system behaviour. Symbols only have contents in a grounded system, so these multiple levels of interpretation are only possible if stimuli are grounded in a finely detailed world.
What is the relationship between symbolic computation and that performed by connectionist networks? Stevan Harnad points out a few minor perceived differences between the capabilities of nets and symbol systems and then goes on to concentrate on solutions to Searle's 'Chinese Room' problem (Searle, 1980). I think he is correct in his argument (Harnad, 1990) that symbol systems which are not grounded in an initially analog world are indeed incomplete models of cognition. In this paper I will point out an important difference between deterministic symbol systems like the Chinese Room and neural networks which may explain formally why grounded and ungrounded systems are not equivalent.
I will start by making two assertions about the power of networks required to model cognition. First, there is the well-known requirement that nets must be composed of non-linear elements if they are to be capable of learning arbitrary classifications (Minsky and Papert, 1969). The second requirement is that networks which are intended to model cognition must be capable of time-dependent information processing (see e.g. Edelman, 1978; Kentridge, 1990). If the units in these networks are not to have infinite memories of their prior states, and time is not sliced up arbitrarily by an all-knowing external clock (it would have to be omniscient in order to predict the longest window needed to detect the determining antecedents of future events), then the architecture of these nets must be recurrent. If these assertions are accepted, then we can consider networks capable of modelling cognition as non-linear dynamical systems from now on.
The behaviour of non-linear dynamical systems can be reconstructed in terms of symbolic languages defined by finite-state, stack or nested-stack automata (Crutchfield and Young, 1990) so a basis exists for directly comparing the symbolic computation of the Chinese Room with that of a neural net and hence of understanding symbol-grounding. When we consider how symbols are constructed when modelling the behaviour of a dynamical system symbolically, following the method of Crutchfield and Young, a fundamental difference between computation in the Chinese Room and computation in a neural network becomes apparent. When constructing a language from a stream of dynamical system states we wish to produce a minimal representation which consists of sets of states mapped onto discrete symbols together with rules specifying the allowable transitions between sequences of those symbols. In the process of producing a discrete symbolic description from a continuous stream of analog system states we must make an initial discrete partition of the data stream by dividing the state space of the dynamical system into a number of regions and then recording which of those regions the system is in at regular intervals. The aim of the language reconstruction is then to reduce long sequences of these discretely partitioned system states into individual symbols which nevertheless still predict the subsequent behaviour of the system at the symbolic level. If the stream of states was in fact produced by a formal language such as that of the Chinese Room then any sufficiently long sequence of states could be modelled by a reduced set of symbols and deterministic transition rules between those symbols, that is, by a deterministic grammar (see e.g. Aho, Sethi and Ullman (1986) for methods of achieving this). This is not the case when the stream of partitioned states is produced by a non-linear dynamical system. 
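The initial partitioning step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the reconstruction method of Crutchfield and Young: the logistic map stands in for the dynamical system, and the names `logistic_orbit`, `symbolize` and `transition_counts`, the parameter `r = 4.0`, the starting state `0.3` and the two-region partition are all hypothetical choices made for the example.

```python
from collections import Counter


def logistic_orbit(x0, r=4.0, steps=1000, burn_in=100):
    """Iterate x -> r*x*(1-x); a hypothetical stand-in for the analog system."""
    x = x0
    states = []
    for i in range(burn_in + steps):
        x = r * x * (1.0 - x)
        if i >= burn_in:
            states.append(x)  # record the state at regular intervals
    return states


def symbolize(states, n_regions):
    """Label each state in [0, 1] with the index of its equal-width region."""
    return [min(int(x * n_regions), n_regions - 1) for x in states]


def transition_counts(symbols):
    """Count how often each symbol is immediately followed by each symbol."""
    return Counter(zip(symbols, symbols[1:]))


states = logistic_orbit(0.3)              # continuous stream of analog states
symbols = symbolize(states, n_regions=2)  # initial discrete partition
counts = transition_counts(symbols)       # raw material for transition rules

for (current, following), k in sorted(counts.items()):
    print(f"{current} -> {following}: {k}")
```

The counts printed at the end are the raw data from which a language reconstruction would go on to build symbols and transition rules; the sketch stops at the partitioned stream and does not attempt the reduction to a minimal automaton.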
We can see, just by considering the initial stream of discretely partitioned states, that no sequence of states can be modelled deterministically by an automaton simpler than the sequence itself. The hallmark of non-linear dynamical systems is chaos. Chaos is defined by the sensitive dependence of a system's evolution on its initial states (Crutchfield, Farmer, Packard and Shaw, 1986). In other words, the eventual behaviour of a chaotic system is unpredictable unless its initial position in state space is known to infinite precision. When we partitioned the state space of our dynamical system into discrete regions, the precision of the state space positions used in our language reconstruction necessarily became finite. The consequence of this is that any finite linguistic model of a non-linear dynamical system which admits chaotic behaviour cannot be completely deterministic - the rules governing the transitions between symbol sequences must be probabilistic if any ordered behaviour of the system is to be captured by a concise symbolic description.
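Sensitive dependence on initial states can be made concrete with a small sketch. It assumes, purely for illustration, the logistic map with `r = 4.0` as the chaotic system and a two-region partition at 0.5; the function name `first_symbolic_divergence` and the offset `eps = 1e-10` are hypothetical. Two starting states that fall in the same region, and so receive the same initial symbol, are iterated until their symbols first differ.

```python
def first_symbolic_divergence(x0, eps=1e-10, r=4.0, max_steps=200):
    """Return the first step at which two nearby orbits fall in different
    regions of a two-region partition, or None if they never do."""
    a, b = x0, x0 + eps
    for step in range(max_steps):
        if (a >= 0.5) != (b >= 0.5):  # the symbols differ at this step
            return step
        a = r * a * (1.0 - a)
        b = r * b * (1.0 - b)
    return None


step = first_symbolic_divergence(0.3)
print("symbol sequences first differ at step:", step)

# With no difference in the starting state the sequences never separate.
print("identical starts:", first_symbolic_divergence(0.3, eps=0.0))
```

An observer who sees only the symbols cannot tell the two starting states apart, yet the symbol sequences they generate eventually disagree. This is the sense in which a finite-precision symbolic model of such a system has to attach probabilities to its transitions.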
I have described a deep difference between the deterministic symbolic computation of the Chinese Room and non-deterministic symbolic descriptions of neural networks' behaviour. How does this difference relate to the symbol-grounding problem? Consider what happens as we change the size of partitions in our initial discretization of the continuous analog stream of system states. We can produce a whole series of non-deterministic symbolic models at different scales of partition all of which describe the behaviour of the underlying dynamical system. As we decrease the size of our partition the predictability of subsequent system behaviour increases and hence the symbols and transition probabilities produced in our reconstruction change. Sequences of fine-grain states which correspond to symbols are collapsed or broken up in coarsely partitioned models. The transitions between symbols in a coarsely partitioned model are more precisely described in a finely partitioned model. If we know something about the set of system states from which an individual symbol is derived in the coarse model then our predictions of the subsequent symbolic behaviour of the system change. The internal structure of the symbol clearly contributes to system behaviour even at the symbolic level. The behaviour of a system in which the starting state (the sensory input) is analog (grounded) is, in theory, deterministic and in practical symbolic representations is better predicted if we know something of the contents of symbols. These multiple levels of symbolic interpretation are not possible in an ungrounded system in which symbols are indivisible.
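The claim that knowing the contents of a coarse symbol changes the prediction of subsequent symbolic behaviour can be illustrated as follows. The sketch again assumes the logistic map with `r = 4.0` as a stand-in system, with a coarse partition into two regions and a fine partition into four; all function names are hypothetical. It estimates the distribution of the next coarse symbol, first given only the current coarse symbol and then given the fine symbol that the coarse one contains.

```python
from collections import Counter, defaultdict


def logistic_orbit(x0, r=4.0, steps=5000, burn_in=100):
    """Iterate x -> r*x*(1-x); a hypothetical stand-in for the analog system."""
    x = x0
    states = []
    for i in range(burn_in + steps):
        x = r * x * (1.0 - x)
        if i >= burn_in:
            states.append(x)
    return states


def symbolize(states, n_regions):
    """Label each state in [0, 1] with the index of its equal-width region."""
    return [min(int(x * n_regions), n_regions - 1) for x in states]


def next_symbol_distribution(contexts, followers):
    """Estimate P(follower | context) from paired observations."""
    counts = defaultdict(Counter)
    for context, follower in zip(contexts, followers):
        counts[context][follower] += 1
    return {
        context: {f: k / sum(c.values()) for f, k in c.items()}
        for context, c in counts.items()
    }


states = logistic_orbit(0.3)
coarse = symbolize(states, 2)  # symbols 0 and 1
fine = symbolize(states, 4)    # fine symbols 0 and 1 both lie inside coarse 0

coarse_view = next_symbol_distribution(coarse[:-1], coarse[1:])
fine_view = next_symbol_distribution(fine[:-1], coarse[1:])

print("next coarse symbol given coarse 0:", coarse_view[0])
print("next coarse symbol given fine 0:  ", fine_view[0])
print("next coarse symbol given fine 1:  ", fine_view[1])
```

For this particular map, every state in [0.25, 0.5) is sent into [0.75, 1], so fine symbol 1 is always followed by coarse symbol 1, whereas coarse symbol 0 taken as an indivisible unit is followed by either symbol. The finer description sharpens the transition probabilities of the coarser one, which is the relationship between levels that the paragraph above describes.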
One question still needs to be addressed: 'Are neural networks which perform cognition likely to exhibit chaotic behaviour?' If the answer to this question were 'no' then the preceding argument about non-determinism and chaos would be irrelevant. Crutchfield and Young (1990) show, however, that before the onset of chaos the behaviour of non-linear dynamical systems can be modelled symbolically by finite-state automata, whereas their behaviour during the transition to chaos is only adequately modelled by nested-stack automata. In linguistic terms, finite-state automata are equivalent to regular grammars (Chomsky, 1963), while nested-stack automata are equivalent to indexed grammars, which include all context-free grammars and some context-sensitive ones (Aho, 1969). Regular grammars are inadequate to describe natural language descriptions of the world; some form of recursive grammar (context-free at the least) is required (Chomsky, 1963). We can conclude that a neural network powerful enough to perform the cognitive tasks required by the Chinese Room problem must therefore exhibit chaotic behaviour.
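The gap between regular and recursive grammars can be shown with a toy language of nested brackets. This is an illustrative sketch only: it contrasts a finite-state recogniser with a single-stack one, which is the context-free level and weaker than the nested-stack automata discussed above, and the function names and the choice of four states are hypothetical.

```python
def accepts_with_stack(text):
    """Recognise balanced brackets using an unbounded stack."""
    stack = []
    for ch in text:
        if ch == "(":
            stack.append(ch)
        elif ch == ")":
            if not stack:
                return False
            stack.pop()
        else:
            return False
    return not stack


def accepts_with_finite_states(text, n_states=4):
    """Recognise balanced brackets with a fixed number of states.

    States 0..n_states-1 record the current nesting depth, so any
    nesting deeper than n_states - 1 cannot be represented.
    """
    state = 0
    for ch in text:
        if ch == "(":
            if state == n_states - 1:
                return False  # no state left to record deeper nesting
            state += 1
        elif ch == ")":
            if state == 0:
                return False
            state -= 1
        else:
            return False
    return state == 0


shallow = "((()))"   # nesting depth 3
deep = "(((())))"    # nesting depth 4

print(accepts_with_stack(shallow), accepts_with_finite_states(shallow))
print(accepts_with_stack(deep), accepts_with_finite_states(deep))
```

However many states the finite-state recogniser is given, some well-formed string nests more deeply than it can track, while the stack-based recogniser handles arbitrary depth. This is the standard reason a regular grammar cannot capture unbounded recursive structure.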
It is interesting to note finally that Crutchfield and Young also show that non-deterministic symbol systems achieve a maximum amount of computation in terms of complexity at the phase-transition between predictable and chaotic behaviour. I have recently (Kentridge, forthcoming) provided evidence suggesting that physiologically realistic neural network models are easily maintained at this phase-transition by diffuse background activity through a mechanism of self-organized criticality (Bak, Tang and Wiesenfeld, 1988). It is therefore not implausible that cognition is achieved by the brain performing efficient non-deterministic symbolic computation at a phase-transition.
To return to the Chinese Room Problem, I propose the following explanation: A Chinese speaker in the real world can understand Chinese because he or she has access to a hierarchy of probabilistic symbolic interpretations of the world; Searle in the Chinese Room cannot because he only has access to a single level of symbols and an inadequate set of deterministic rules connecting them.
Aho, A.V. (1969) Nested stack automata. Journal of the Association for Computing Machinery 16: 383-406.
Aho, A.V.; Sethi, R. and Ullman, J.D. (1986) Compilers: Principles, Techniques and Tools. Reading, Mass.: Addison-Wesley.
Bak, P.; Tang, C. and Wiesenfeld, K. (1988) Self-organized criticality. Physical Review A 38: 364-374.
Chomsky, N. (1963) Formal properties of grammars. In R.D. Luce, R.B. Bush and E. Galanter (Eds.) Handbook of Mathematical Psychology, Volume 2. New York: John Wiley.
Crutchfield, J.P.; Farmer, J.D.; Packard, N.H. and Shaw, R.S. (1986) Chaos. Scientific American 254: 46-57.
Crutchfield, J.P. and Young, K. (1990) Computation at the onset of chaos. In W.H. Zurek (Ed.) Complexity, Entropy and the Physics of Information. (Santa Fe Institute Proceedings Volume VIII). Redwood City, CA.: Addison-Wesley.
Edelman, G.M. (1978) Group selection and phasic reentrant signaling: A theory of higher brain function. In G.M. Edelman and V.B. Mountcastle, The Mindful Brain: Cortical Organization and the Group-Selective Theory of Higher Brain Function. Cambridge, Mass.: MIT Press.
Harnad, S. (1990) The symbol grounding problem. Physica D 42: 335-346.
Kentridge, R.W. (1990) Neural networks for learning in the real world: representation, reinforcement and dynamics. Parallel Computing 14: 405-414.
Kentridge, R.W. (forthcoming) Critical dynamics of neural networks with spatially localised connections. To appear in M. Oaksford and G. Brown (Eds.) Neurodynamics and Psychology. London: Academic Press.
Minsky, M.L. and Papert, S. (1969) Perceptrons. Cambridge, Mass.: MIT Press.
Searle, J.R. (1980) Minds, brains and programs. Behavioral and Brain Sciences 3: 417-424.
This research was supported by DRA Fort Halstead, Contract 2051/047/RARDE. I would like to thank Rosemary Stevenson for helpful comments on an earlier draft of this paper.
There are several ways to construe Kentridge's friendly suggestions about chaos and cognition; some are indeed supportive, but some may be Trojan horses. It all boils down to which of the three rooms in the three-room argument his arguments apply to: If nonlinear dynamical systems that display chaos have essential analog properties (analogous to essential parallelism, as in room one, par), if a chaotic system is, like a transducer or a furnace, something a system has to be in order to display certain properties essential to cognition (in other words, if symbolically/numerically simulated chaos, "virtual" chaos, won't do), then all we need is the demonstration that such an essential property of chaos is indeed also a property essential to cognition. Transduction had face validity, but chaos requires an argument or a proof (I'm not sure I see either in Kentridge's commentary, but perhaps I have not understood it fully).
On the other hand, if all the performance properties of chaos could also be exhibited by room 2 (sim), which would now be a symbolic/numerical simulation of the chaotic system in room 1 (i.e., a "nondeterministic symbolic description of the neural network's behavior," perhaps using numerical probabilities, multiplicative interactions, even pseudo-random number generators), then we would of course be back where we were in the beginning.
My own approach has the virtue of not stipulating anything about the innards of the winning system except that it must include sensorimotor transduction, which is of necessity analog. I can't tell whether Kentridge's proposal pertains only to properties of the innards (a nonlinear dynamical system with the capacity to exhibit chaos), or also to the properties of the analog input itself (in which case our positions are even more compatible). One still waits, of course, to see the full performance capacity of chaos (whether simulated or real): Can it, for example, help us pass the TTT?