amitabha mukerjee

 

professor
department of computer science and engineering
indian institute of technology, kanpur
kanpur- 208 016, india.
amit [at] iitk.ac.in
 


Research

I work at the intersection of Computer Vision and Natural Language. I am particularly interested in the process by which a perceptual system with models for similarity and attention may acquire perceptual schemas for actions and eventually associate them with perceptual units of speech (linguistic labels). Following recent work in psychology, we propose models for how infants may acquire concepts of objects, actions and relations from perception. Concepts of relations and actions involve arguments, and the constraints among these arguments may lead to constraints in syntax.

The work has two ramifications for building AI systems. First, scaling up to human-like capabilities may not be possible by hand-coding the knowledge; learning such models as schemas based on extended perceptual-motor interactions appears to be a better approach.

h-symbol: human symbols attach a meaning to each label
f-symbol: formal computer symbols are just the labels
Second, the very structure of computation as we understand it today, where empty symbols are combined according to some "grammar", appears unlikely to support aspects of language such as ambiguity, metaphor and indexicals. A formal "symbol" is an empty token in an alphabet, whereas symbols in everyday use stand for something. Instead of empty f-symbols, human usage involves experientially grounded h-symbols. We propose that abstractions from the perceptual-motor space, called image schemas, may constitute the grounding for such symbols. Similarity between schemas makes the symbols elastic. Computation at the semantic level, with h-symbols, provides solutions to many problems in AI, such as defeasible reasoning and the frame problem.

An associated problem I am looking at involves discovering design symbols based on exploring design spaces, and models for learning design expertise.

Select Publications: Attention and Symbol Emergence

  • G. Satish, Amitabha Mukerjee
    Acquiring linguistic argument structure from multimodal input using attentive focus,
    7th IEEE International Conference on Development and Learning (ICDL 2008), 2008, p. 43-48.
    [pdf]

    Unsupervised temporal clustering is used to discover spatial activity from a 2D video. Perceptual attention restricts the search to objects that are attended to sequentially. The learned temporal templates constitute a model for each activity. Facts such as chase taking two arguments, or these arguments being commutative, are discovered in perception. Next, these actions are associated with words from a commentary based on maximum likelihood, without any knowledge of grammar. Thus the semantics of actions may inform aspects of grammar such as argument structure or word order. Further, many frame axioms may be inferred directly from such perceptual models.

  • Amitabha Mukerjee
    Using attentive focus to discover action ontologies from perception
    Fifth International Workshop on Neural-Symbolic Learning and Reasoning NeSy09, Jul 11, 2009   [pdf]

    By changing the granularity of the classification in the above process, one may be able to learn that move-away-A-fixed, move-away-B-fixed, and move-away-both-simultaneously are different types of move-away.

  • Amitabha Mukerjee and Mausoom Sarkar
    Perceptual Theory of Mind: An intermediary between visual salience and Noun / Verb Acquisition
    International Conference on Developmental Learning ICDL-06, Bloomington, Indiana, May 31-June 3, 2006   [pdf]

    In using attention to learn from a micro-world, it is necessary to assume that certain aspects of the scene that are salient to an adult speaker are also salient for the learning organism (e.g. a human infant). This assumption leads to the ability to map words corresponding to nouns and also verbs, from unparsed linguistic commentary.

    Frame from complex 3D video. Here foreground object blobs are first clustered based on appearance. These noisy clusters are associated with words from a commentary. Nouns for four of the ten object classes can be learned. Also, action descriptors such as "moving from left to right" can be learned. [Guha/Mukerjee 2008]
  • Prithwijit Guha, Amitabha Mukerjee
    Language Label Learning for Visual Concepts Discovered from Video Sequences
    5th Workshop on Attention in Cognitive Systems, Springer LNCS, ed. Lucas Paletta and Erich Rome, 2008, p. 81-94   [pdf]

  • Vivek Kumar Singh, Subhransu Maji and Amitabha Mukerjee
    Confidence-based updation of Motion Conspicuity in Dynamic Scenes
    Third Canadian Conference on Computer and Robot Vision 2006 CRV-06, Quebec City.
    [pdf]

  • Amitabha Mukerjee, Madan M. Dabbeeru
    The birth of symbols in design
    21st International Conference on Design Theory and Methodology, San Diego, August 31-Sept 2, 2009.   [pdf]

    Solutions to hard problems may lie in small regions of the possible space. These regions can be characterized using much lower dimensions than used in the original problem formulation. These lower dimensional mappings may correspond to "chunks" which have been related to the development of expertise in areas such as design. Eventually, some of these "chunks" may map to "symbols" that associate a label with a meaning.
    More: The "infant designer" enterprise
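The word-action association step described under Satish and Mukerjee (2008) above can be illustrated with a minimal sketch. The entry says only that actions are associated with commentary words "based on maximum likelihood", so the formulation below is an assumption rather than the paper's method: it estimates P(action | word) by relative frequency over co-occurring episodes and labels each action cluster with the word that maximizes it. The cluster ids (`c0`, `c1`), the toy commentary, and the function name `associate_labels` are all hypothetical.

```python
from collections import Counter, defaultdict


def associate_labels(episodes):
    """Label each action cluster with its most strongly associated word.

    episodes: list of (action_id, words) pairs, where words is the
    unparsed commentary heard while that action was observed.
    Assumed scoring (not taken from the paper): the relative-frequency
    estimate of P(action | word) = count(word, action) / count(word).
    """
    joint = defaultdict(Counter)  # joint[action][word] = episodes with both
    word_total = Counter()        # word_total[word] = episodes with the word

    for action, words in episodes:
        for w in set(words):      # count each word once per episode
            joint[action][w] += 1
            word_total[w] += 1

    labels = {}
    for action, counts in joint.items():
        # Highest P(action | word) wins; ties go to the word seen more
        # often with this action, then to alphabetical order.
        labels[action] = max(
            sorted(counts),
            key=lambda w: (counts[w] / word_total[w], counts[w]),
        )
    return labels


# Hypothetical toy data: two action clusters and bag-of-words commentary.
episodes = [
    ("c0", ["the", "square", "chase", "the", "circle"]),
    ("c0", ["circle", "chase", "square"]),
    ("c1", ["the", "square", "move", "away"]),
    ("c1", ["circle", "move", "the", "square"]),
]

labels = associate_labels(episodes)
print(labels)
```

Scoring by P(action | word) rather than P(word | action) keeps frequent words such as "the" or "square", which occur with every action, from being chosen as action labels. No grammar or part-of-speech information is used, in keeping with the description above.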

     

    Other interests: Hands-on learning in schools

     



    Update: Oct 2009