amitabha mukerjee


department of computer science and engineering
indian institute of technology, kanpur
kanpur- 208 016, india.
amit [at]

Research highlights: the reflective baby

The primary interest of our research group is in developmental cognition: how an infant (or, for that matter, a robot) might acquire concept-like structures. Our central hypothesis is that by doing similar things repeatedly and noting which solutions give good results, one can internalize certain correlations between parameters, which reduce the degrees of freedom in the decision space. We invoke this paradigm, which reduces dimensionality by restricting the search to a manifold in the decision space, across a diverse range of problems: concepts of containment, visual motion planning, associating words, metaphorical transfer, learning mechanics, and the acquisition of syntax. The term "reflective baby" draws on Donald Schon's coinage, the "reflective practitioner", except that here the reflection may be subconscious.

Expertise is known to involve the acquisition of {\em chunks}: compact representations of the input that preserve what is functionally salient. The process is often implicit: performing a task repeatedly, we fuse those aspects of the input that are correlated, so that solutions lie along some low-dimensional surface in the input space. We propose a computational model for discovering such manifolds in high-dimensional sensorimotor space; good designs, for example, often lie on low-dimensional subspaces. Next, we consider the infant as an optimizer, and show how such manifolds may emerge in the baby's sensorimotor space. A particular focus of this work is relating concepts acquired from sensory data to language, both at the word-learning and syntax levels. This involves using computer vision to discover patterns in the input, robotics to explore feasible actions in that space, and natural language processing to map these to language.
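As a toy illustration of this manifold idea (not our actual model): if the effective degrees of freedom are fewer than the ambient dimensions, even plain PCA on a sample of solutions exposes the low dimensionality. All data here are synthetic.

```python
import numpy as np

# Hypothetical illustration: 100 "solutions" controlled by only 2 latent
# parameters, but observed in a 10-dimensional decision space.
rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 2))      # the true degrees of freedom
mixing = rng.normal(size=(2, 10))       # correlations between parameters
data = latent @ mixing + 0.01 * rng.normal(size=(100, 10))

# PCA via SVD: the singular-value spectrum reveals the manifold dimension.
centered = data - data.mean(axis=0)
s = np.linalg.svd(centered, compute_uv=False)
explained = s**2 / np.sum(s**2)
dim = int(np.sum(explained > 0.01))
# solutions occupy a low-dimensional surface (here 2-D) in the 10-D space
```

Here the manifold happens to be linear, so PCA suffices; curved solution surfaces call for nonlinear dimensionality reduction, but the spectrum-gap intuition is the same.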

The symbol for "in". At its semantic pole is an image schema (a generative classifier): a function with two arguments, based on a distribution over the visual angle. The association of this schema with the linguistic unit "in" is learned from co-occurring language.

Spatial structure discovery

Consider spatial relations, e.g. "containment", where an object is contained or enclosed by something. A reference such as
the shelf in the bedroom
is a direct spatial usage of the term: the shelf is the trajector (tr) contained in the bedroom, the landmark (lm). Babies show sensitivity to containment from 2 months onwards, and this relationship generalizes (becomes more schematized or abstract) by six months, when they can distinguish the relationship independently of the participating objects.

Subsequently, with increasing sensitivity to the sounds of language and an ability to parse word boundaries (around 10-14 months), infants begin to map such relations to words. One of the perceptual signatures available to an infant is the visual angle subtended by an object on the retina. If one clusters the visual angles for various landmarks, a stable pattern arises when the angle reaches 360 degrees: when we are fully inside a space like a room, the angle does not change with our local motions. We show that computationally such a pattern is discovered naturally as a stable cluster, and argue that this cluster may provide an initial characterization of containment.

Mapping to language

However, mapping such a perceptual structure to language is problematic. Different languages carve up the space of spatial relations in different ways. Also, with increasing exposure to language, the mental model of a lexical item (the image schema) itself changes. One of the objectives of this work is to trace this ontogenetic evolution in the image schema of containment relations.

In this work, we demonstrate the process via a computational simulation based on a simple video. We consider how an early learner may acquire the perceptual notion of full containment and how she may learn to map it for different languages.

Initial image schema: Let us say the agent is considering two-object interactions between a trajector (tr) and a landmark (lm). By observing the distribution of the visual angle subtended by the lm at the tr, we discover that in situations of full containment, the visual angle of lm at tr is very close to 360 degrees. This may be thought of as a computational model of an initial image schema for containment.
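A minimal sketch of how such a visual-angle signature might be computed, assuming a 2-D world with the landmark given as sampled boundary points (the names and numbers are illustrative, not our implementation):

```python
import math

def visual_angle(tr, boundary, bins=360):
    """Approximate the visual angle (degrees) subtended at the trajector:
    the fraction of viewing directions from tr that hit some boundary
    sample of the landmark."""
    covered = [False] * bins
    for (x, y) in boundary:
        theta = math.atan2(y - tr[1], x - tr[0]) % (2 * math.pi)
        covered[int(theta / (2 * math.pi) * bins) % bins] = True
    return 360.0 * sum(covered) / bins

# A unit-square landmark, sampled densely along its boundary.
n = 200
box = ([(i / n, 0.0) for i in range(n)] +
       [(1.0, i / n) for i in range(n)] +
       [(1.0 - i / n, 1.0) for i in range(n)] +
       [(0.0, 1.0 - i / n) for i in range(n)])

inside = visual_angle((0.5, 0.5), box)   # trajector fully contained: ~360
outside = visual_angle((3.0, 0.5), box)  # trajector outside: a narrow cone
```

For the contained trajector the directions to the boundary wrap the full circle, so the measure saturates at 360 degrees regardless of the trajector's local motion; from outside, the same landmark subtends only a narrow angular window.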

Symbolic unit discovery: The system now has a fragile perceptual schema; it needs to discover what to call it. For this, it considers co-occurring commentaries by a number of adults. All lexical items uttered in sentences co-occurring with containment situations are candidates for association. Using a mutual information measure of association, we discover that in, into, and inside are the words most strongly associated with full containment. Thus we learn a symbol (in the sense of cognitive grammar) with both phonological and semantic poles.
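The association step can be sketched as pointwise mutual information between words and containment scenes. The commentary below is a made-up toy set; with counts this small, stray singletons can tie with the prepositions, so in practice a frequency cutoff and far more data are needed.

```python
import math
from collections import Counter

# Hypothetical commentary: each utterance is paired with whether the
# co-occurring scene shows full containment.
commentary = [
    ("the circle moves into the box", True),
    ("the square moves into the ring", True),
    ("the ball is in the cup", True),
    ("the circle is in the box", True),
    ("the ball rolls inside the cup", True),
    ("the square fits inside the box", True),
    ("the circle moves towards the square", False),
    ("the ball rolls away from the box", False),
    ("the square sits on the table", False),
]

word_counts, word_in_counts = Counter(), Counter()
n = len(commentary)
n_in = sum(1 for _, c in commentary if c)
for sentence, contained in commentary:
    for w in set(sentence.split()):
        word_counts[w] += 1
        if contained:
            word_in_counts[w] += 1

def pmi(w):
    # pointwise mutual information between word w and containment scenes
    p_joint = word_in_counts[w] / n
    p_w, p_in = word_counts[w] / n, n_in / n
    return math.log2(p_joint / (p_w * p_in)) if p_joint else float("-inf")
```

On this toy data the containment prepositions ("into", "in", "inside") come out at the top of the PMI ranking, while frequent but uninformative words like "the" score near zero.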

The discovered symbol has two argument slots for {tr, lm}, and a function that discriminates whether a given spatial situation between them is [IN] or not. Conversely, on hearing the word "in", the schema can also generate a visualization, by imagining a landmark that subtends a near-360-degree visual angle at the trajector.

Phrase structure discovered from an untagged corpus, using the ADIOS algorithm [Solan/Edelman:2002].

Symbolic composition: the symbolization for "in the box", with one argument slot free, for the trajector.

Discovering syntactic constructions (symbolic composition): Subsequently, the system looks at the patterns of tokens that appear in the same commentary. Several constructions are found to co-occur frequently with containment situations, e.g. "the {circle|big square} moves into the box". These constructions are then associated with containment.

We can now recognize synonyms using this construction: if someone says "the big block goes into the box", then by comparing with the visual image we can recognize "big block" as a synonym for "big square". At this stage, some pronominal anaphora ("it", "them", "each other") are also discovered as polysemies.
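A hypothetical sketch of this slot-alignment idea: a learned construction with a free trajector slot is matched against a new utterance, and the filler is aligned with the object label obtained from vision. The regex is only a stand-in for the learned construction.

```python
import re

# Illustrative stand-in for a learned construction with a free tr slot.
construction = r"the (?P<tr>[\w ]+?) (?:moves|goes) into the box"
utterance = "the big block goes into the box"
seen_object = "big square"   # object label recovered from the visual scene

m = re.match(construction, utterance)
# aligning the slot filler with the visually identified object
synonyms = {m.group("tr"): seen_object} if m else {}
# -> {"big block": "big square"}
```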

Meaning enrichment and metaphor: Once such language constructions are known, we can identify them in novel text, without co-occurring perceptual input. This is how most words are learned.

E.g. the construction "in the X" has X as the container. From the large Brown corpus, we find a number of tokens appearing in this position, which tells us that these tokens act as containers. However, by looking at the object classes these tokens come from, we can see that many are not instances of direct spatial containment. Hence we gradually extend the meaning to include conventionalized metaphorical extensions; the semantic pole is enriched with containers such as time ("in the nineties") or group ("in the team").
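Harvesting fillers for the X slot can be sketched as a simple pattern match over raw text; the corpus fragments below are invented stand-ins for Brown corpus sentences.

```python
import re
from collections import Counter

# Hypothetical corpus fragments (stand-ins for Brown corpus text).
text = """He put the keys in the box. She grew up in the nineties.
He was the fastest runner in the team. The cat slept in the basket.
There was a revival of jazz in the nineties."""

# Tokens filling the X slot of the construction "in the X".
fillers = Counter(re.findall(r"\bin the (\w+)", text.lower()))
# fillers like "box" and "basket" are spatial containers; "nineties" and
# "team" are not, flagging candidate metaphorical extensions
```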

Building artificial agents

This work has two ramifications for building AI systems. First, scaling up to human-like capabilities may not be possible by hand-coding the knowledge: unsupervised approaches for learning schemas may be more scalable, though they require a great many situations with co-occurring language. One of our goals should be to develop large corpora with such data.

An associated problem I am looking at involves discovering design symbols by exploring design spaces. Here good solutions lie on manifolds, since many design parameters are inter-related (as strength increases, both width and height may go up). This is fundamental to acquiring the tacit knowledge that underlies many types of design decisions.

Select Publications

Visuo-motor learning on manifolds (path)

Semantically-driven language acquisition (containment)

Dynamic attention models

Theory of Mind

Word learning in 3D scenes

Cognitive models in design


Other interests: Hands-on learning in schools





Update: Oct 2009