amitabha mukerjee
professor
department of computer
science and engineering
indian institute of technology, kanpur
kanpur- 208 016, india.
amit [at] iitk.ac.in
Research
I work at the intersection of Computer Vision and Natural Language. I am
particularly interested in the process by which a perceptual system with
models for similarity and attention may acquire perceptual-schemas for
actions and eventually associate them with perceptual units of speech
(linguistic labels). Following recent work in psychology, we
propose models for how infants may be acquiring concepts of objects, actions
and relations from perception. Concepts of relations and actions involve
arguments, and the constraints among these may lead to
constraints in syntax.
The work has two ramifications for building AI systems. First, scaling up to
human-like capabilities may not be possible by hand-coding the knowledge;
learning such models as schemas based on extended perceptual-motor interactions
appears to be a better approach.
h-symbol: human symbols attach a meaning to each label.
f-symbol: formal computer symbols are just the labels.
Second, the very structure of computation
as we understand it today, where empty symbols are combined according to some
"grammar", appears unlikely to support aspects of language such as ambiguity,
metaphor, and indexicals. A formal "symbol" is an empty token in an
alphabet, whereas symbols in everyday use stand for something.
Instead of empty f-symbols, human usage involves the experientially grounded
h-symbols. We propose that abstractions from the perceptual-motor
space, called image schemas, may constitute the grounding for such
symbols. Similarity between schemas makes the symbols elastic. Computation
at the semantic level, with h-symbols, provides solutions to many problems in
AI, such as defeasible reasoning and the frame problem.
An associated problem I am looking at involves discovering design symbols
based on exploring design spaces, and models for learning design
expertise.
Select Publications: Attention and Symbol Emergence
G. Satish, Amitabha Mukerjee
Acquiring linguistic argument structure from
multimodal input using attentive focus,
7th IEEE International Conference on Development and Learning
(ICDL 2008), p. 43-48.
[pdf]
Unsupervised temporal clustering is used to discover spatial
activity from a 2D video. Perceptual attention
restricts search to objects attended to sequentially. Learned
temporal templates constitute a model for each activity. Properties such as
the fact that chase takes two arguments, or that these are commutative,
are discovered in perception. Next,
these actions are associated with words from a commentary based on maximum
likelihood, without any knowledge of grammar. Thus the semantics of
actions may inform aspects of grammar such as argument structure
or word order. Further, many frame axioms may
be inferred directly from such perceptual models.
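The word-association step described above can be illustrated with a small sketch. It is not the method from the paper: it assumes the action clusters have already been discovered, it uses a toy hand-written commentary, and it replaces the maximum-likelihood association with a simple co-occurrence score (joint probability weighted by pointwise mutual information). The function name, cluster names, and sentences are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def associate_words(episodes):
    """Pick, for each action cluster, the commentary word most associated with it.

    episodes: list of (action_cluster, commentary_sentence) pairs.
    No grammar is used; each sentence is treated as a bag of words.
    The score is an assumption made for this sketch, not the paper's model.
    """
    n = len(episodes)
    action_counts = Counter()
    word_counts = Counter()
    joint_counts = defaultdict(Counter)
    for action, commentary in episodes:
        words = set(commentary.lower().split())
        action_counts[action] += 1
        for word in words:
            word_counts[word] += 1
            joint_counts[action][word] += 1

    def score(action, word):
        # Joint probability times pointwise mutual information.
        p_joint = joint_counts[action][word] / n
        p_word = word_counts[word] / n
        p_action = action_counts[action] / n
        return p_joint * math.log(p_joint / (p_word * p_action))

    return {
        action: max(sorted(words), key=lambda w: score(action, w))
        for action, words in joint_counts.items()
    }


# Toy data: cluster names and sentences are invented for illustration.
episodes = [
    ("cluster_0", "the red square chases the blue circle"),
    ("cluster_0", "big square chases the small one"),
    ("cluster_1", "the circle flees from the square"),
    ("cluster_1", "red one flees"),
]
labels = associate_words(episodes)
```

On this toy input, frequent words such as "the" and "square" score low because they occur with both clusters, while the verb that co-occurs with only one cluster scores highest.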
Amitabha Mukerjee
Using attentive focus to discover action ontologies from
perception
Fifth International Workshop on Neural-Symbolic Learning and
Reasoning NeSy09, Jul 11, 2009
[pdf]
By changing the granularity of the classification in the
above process, one may be able to learn that move-away-A-fixed,
move-away-B-fixed, and move-away-both-simultaneously are different types of
move-away.
Amitabha Mukerjee and Mausoom Sarkar
Perceptual Theory of Mind: An intermediary between visual salience and
Noun / Verb Acquisition
International Conference on Development and Learning ICDL-06, Bloomington, Indiana,
May 31-June 3, 2006
[pdf]
In using attention to learn from a micro-world, it is
necessary to assume that certain aspects of the scene that are salient to
an adult speaker are also salient for the learning organism (e.g. a human
infant). This assumption leads to the ability to map words corresponding
to nouns and also verbs, from unparsed linguistic commentary.
Frame from complex 3D video. Here foreground object blobs are first
clustered based on appearance. These noisy clusters are associated with
words from a commentary. Nouns for four of the ten object classes can be
learned. Also, action descriptors such as "moving from left to right" can be
learned. [Guha/Mukerjee 2008]
Prithwijit Guha, Amitabha Mukerjee
Language Label Learning for Visual Concepts Discovered from Video
Sequences
5th Workshop on Attention in Cognitive Systems,
Springer LNCS, ed. Lucas Paletta and Erich Rome, 2008, p. 81-94
[pdf]
Vivek Kumar Singh, Subhransu Maji and Amitabha Mukerjee
Confidence-based updation of Motion Conspicuity in Dynamic Scenes
Third Canadian Conference on Computer and Robot Vision 2006 CRV-06, Quebec City.
[pdf]
Amitabha Mukerjee, Madan M. Dabbeeru
The birth of symbols in design
21st International Conference on Design Theory and Methodology, San Diego, August 31-Sept 2, 2009.
[pdf]
Solutions to hard problems may lie in small regions of
the possible space. These regions can be characterized using much lower
dimensions than used in the original problem formulation. These lower
dimensional mappings may correspond to "chunks" which have been related to the
development of expertise in areas such as design. Eventually, some of
these "chunks" may map to "symbols" that associate a label with a meaning.
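As a rough illustration of a solution region that needs fewer dimensions than the original formulation, the sketch below generates hypothetical three-parameter designs that lie close to a line and measures how much of their variance a single direction captures. Principal component analysis (via power iteration) is used here only as a stand-in; this page does not state which dimensionality-reduction method the paper uses, and the data, noise level, and function name are assumptions.

```python
import random


def dominant_variance_fraction(points, iterations=200):
    """Fraction of total variance captured by the first principal direction."""
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    centered = [[p[j] - mean[j] for j in range(d)] for p in points]
    cov = [
        [sum(c[i] * c[j] for c in centered) / n for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeatedly apply the covariance matrix and normalise.
    # The all-ones start vector is assumed not to be orthogonal to the
    # dominant direction, which holds for the toy data below.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    top = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
    total = sum(cov[i][i] for i in range(d))
    return top / total


# Hypothetical "good designs": three parameters driven by one hidden factor t.
rng = random.Random(0)
good_designs = []
for _ in range(200):
    t = rng.uniform(0.0, 1.0)
    good_designs.append([
        t + rng.gauss(0.0, 0.01),
        2 * t + rng.gauss(0.0, 0.01),
        3 * t + rng.gauss(0.0, 0.01),
    ])
fraction = dominant_variance_fraction(good_designs)
```

Here nearly all of the variance lies along one direction, so a single derived parameter describes the good designs almost as well as the original three. Such a derived parameter is the kind of "chunk" the paragraph above refers to.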
More: The "infant designer" enterprise
Other interests: Hands-on learning in schools
Update: Oct 2009