PROJECT PROPOSAL

COURSE-ME768
ARTIFICIAL INTELLIGENCE IN ENGINEERING
INSTRUCTOR: Dr. AMITABHA MUKHERJEE

ARTICULATED AGENT MOTIONS BASED ON NL INPUT

Kesari Anandsudhakar
Rajesh Rajasekar
Vikrant Kumar
email: { ands, rajraj, vikrantk }@iitk.ac.in

Motivation

The Virtual Director project at IIT Kanpur attempts to build a virtual environment and produce actions in it according to a natural language story. This project fits into the big picture beyond the stage where natural language has been parsed for grammar and semantic information, and the spatial data defuzzified. At this point, the input would be in the form of spatial descriptors and action directives which are at a level of complexity below natural language but higher than the most primitive of directives.

Objectives:



Past Work

Work has been done in this field at the Indian Institute of Technology, Kanpur by Dr. Amitabha Mukherjee as a part of the Virtual Director project.

The aspects of anthropometrics and modelling are discussed by Norman Badler and Stephen Smoliar [Badler/Smoliar:1979]. Physical aspects such as the dynamics of complex objects are studied by Hoffman and Hopcroft [Hoffman/Hopcroft:1987]. This will be used to perform character animation.

Norman Badler has developed Jack, a basic human body animation system [Badler]. An extension of the basic Jack system by Levison et al. [Geib/Levison/Moore:1994] is SodaJack, which simulates a soda bar operator. Action planning in the context of searching for and manipulating objects is discussed.

Salesin et al. have created a precise language for the otherwise fuzzy process of defining shots and other basic elements of cinematography [Salesin/Cohen/Christianson:96]. Camera control to amplify the impact of the story can be based on this convention.

Maron and Lozano-Perez explain how visibility graphs can be used for path planning in a domain with polygonal obstacles [Lozano/Lozano-Perez:1996].



Sample Input-Output

Sample outputs from the Path Planning Module:

The results of motion planning are shown in the following images. In the actual project, the input is taken from a file containing the co-ordinates of all the obstacles, and the output is another file containing the co-ordinates of the path. The images below were plotted for better understanding.


The above image shows the obstacles in the environment. There are six in all.


This image shows the obstacle space calculated from the obstacles. Strictly, the corners of each grown obstacle should be rounded, but we have approximated them as sharp. This approximation simplifies the calculation of the visibility graph.
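The obstacle-space step can be sketched roughly as follows. This is only an illustration, assuming axis-aligned rectangular obstacles and an agent approximated by a circle of a given radius; the class, type, and method names are hypothetical and are not the project's own o_space method.

```java
/** Illustrative sketch only: grows axis-aligned rectangles by a clearance radius. */
class ObstacleSpaceSketch {

    /** Axis-aligned rectangle given by its minimum and maximum corners (hypothetical type). */
    static final class Rect {
        final float minX, minY, maxX, maxY;

        Rect(float minX, float minY, float maxX, float maxY) {
            this.minX = minX; this.minY = minY; this.maxX = maxX; this.maxY = maxY;
        }
    }

    /** Grows one obstacle by radius on every side; corners stay sharp rather than rounded. */
    static Rect grow(Rect r, float radius) {
        return new Rect(r.minX - radius, r.minY - radius, r.maxX + radius, r.maxY + radius);
    }

    /** Grows every obstacle, giving the obstacle space for the whole scene. */
    static Rect[] obstacleSpace(Rect[] obstacles, float radius) {
        Rect[] grown = new Rect[obstacles.length];
        for (int i = 0; i < obstacles.length; i++) {
            grown[i] = grow(obstacles[i], radius);
        }
        return grown;
    }

    public static void main(String[] args) {
        Rect g = grow(new Rect(2f, 3f, 5f, 7f), 0.5f);
        System.out.println(g.minX + " " + g.minY + " " + g.maxX + " " + g.maxY);
    }
}
```

Keeping the corners sharp slightly over-estimates the forbidden region near each corner, which is the approximation described above.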


This image shows the path (in dark green) between start and goal. The start is at (0,0) and the goal is at (28.5,45). The path is shown in the actual environment.


In this image the same path is shown in configuration space.


Here motion planning is done in the same obstacle space, but the goal is changed to (16.5,25). In this image, the path obtained is not smoothed.


This image shows the effect of smoothing on the path obtained above. The snake is applied iteratively five times to the path, and the result is shown above. We believe that better smoothing can be obtained by varying the coefficients of the external and internal energy functions and the number of times the snake is applied to the path.

The visible decomposition technique was tested in several obstacle spaces and gave successful results. Only the energy function of the snakes needs to be fine-tuned to obtain smoother paths.
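A much simplified sketch of the iterative smoothing is given below. It is an assumption-laden illustration, not the project's snakes class: only an internal (smoothness) term is modelled, by moving each interior point a fraction alpha toward the midpoint of its neighbours, and the external (obstacle) energy term is omitted entirely. All names and the parameter alpha are hypothetical.

```java
/** Illustrative sketch only: internal-energy relaxation of a polyline path. */
class SnakeSmoothingSketch {

    /** One pass: interior points move a fraction alpha toward the midpoint of their neighbours. */
    static double[][] relaxOnce(double[][] path, double alpha) {
        double[][] out = new double[path.length][];
        for (int i = 0; i < path.length; i++) {
            out[i] = path[i].clone(); // start and goal are copied unchanged
        }
        for (int i = 1; i < path.length - 1; i++) {
            for (int k = 0; k < 2; k++) {
                double mid = 0.5 * (path[i - 1][k] + path[i + 1][k]);
                out[i][k] = path[i][k] + alpha * (mid - path[i][k]);
            }
        }
        return out;
    }

    /** Applies the relaxation the given number of times (five in the experiment above). */
    static double[][] smooth(double[][] path, double alpha, int passes) {
        double[][] current = path;
        for (int p = 0; p < passes; p++) {
            current = relaxOnce(current, alpha);
        }
        return current;
    }

    public static void main(String[] args) {
        double[][] path = { { 0, 0 }, { 1, 2 }, { 2, 0 } };
        double[][] smoothed = smooth(path, 0.5, 5);
        System.out.println(smoothed[1][0] + ", " + smoothed[1][1]);
    }
}
```

Without an external term, repeated passes will eventually pull the path straight through obstacles, which is why the balance between the two energy coefficients matters.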

The input to the program will be a description of the scene followed by a transcript of the interactions between the various components of the scene.


Man Jack
Book HoggCraig
Table t
##
Jack at Hall 3 10
t at Hall 7 4
HoggCraig on t
##
Jack look HoggCraig
Jack pick HoggCraig
Jack read HoggCraig
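The story format above can be read section by section. The following is a minimal sketch, assuming the three sections (entity declarations, placements, actions) are separated by lines containing only "##" and that fields are separated by white-space; the class and method names are hypothetical and this is not the StoryWrapper implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only: splits a story into "##"-separated sections of tokenized lines. */
class StoryFormatSketch {

    /** Returns one list per section; each entry holds the white-space-separated fields of a line. */
    static List<List<String[]>> parse(String story) {
        List<List<String[]>> sections = new ArrayList<>();
        sections.add(new ArrayList<>());
        for (String line : story.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore blank lines
            }
            if (trimmed.equals("##")) {
                sections.add(new ArrayList<>()); // start the next section
                continue;
            }
            sections.get(sections.size() - 1).add(trimmed.split("\\s+"));
        }
        return sections;
    }

    public static void main(String[] args) {
        String story = "Man Jack\nTable t\n##\nJack at Hall 3 10\n##\nJack look t\n";
        List<List<String[]>> sections = parse(story);
        System.out.println(sections.size() + " sections; first action verb: "
                + sections.get(2).get(0)[1]);
    }
}
```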
Input to Camera Control Module:
look
10   3  12  4   //direction of gazing
45              // no of frame
end

path     //the pick is interpolated into a goto and a pick.
1.5      //velocity
0  0     //the following nos. are path pts.
3  5
7  -1
10  3
15  6
end

pick
13  7        //position of the person
 13  10      //position of the object
20           //no of frames
end

read
15  3  15  7   //position of the reader and its direction of sitting
30             //no. of frames
end

Sample output of the Camera Control Module:
static      //look
12.5  15.0
0.6282
5
12.5  15.0  12.5  12.0
goby          //path
1.7149858 -1.0289916
0.4487143
1.7149858 -1.0289916 0.0 0.0
0.0 0.0 1.7149857 2.8583095
track
54.0
1.047
3.4299717 1.8293179 5.366164 5.056305
3.4299717 1.8293179 1.7149857 2.8583095
1.7149857 2.8583095 3.6511781 6.0852966
pan
5.056305
1.047 5.366164
3.6511781 6.0852966
2.2979367 6.053095
track
54.0
1.047
5.366164 5.056305 35.310883 -41.922836
5.366164 5.056305 2.2979367 6.053095
2.2979367 6.053095 33.595898 -40.893845
// truncated for conciseness ...
static      //pick
12.5  15.0
0.6282
5
12.5  15.0  12.5  12.0
static      //read
13.0  12.0
0.4487143
15
13.0  12.0  10.0  12.0
// truncated for conciseness ...



Library Documentation

Class StoryWrapper

Encapsulates the story in an object. The processing results in a trace of the story which is a representation with no inter-object references. Paths are resolved into walkable segments.

Public methods:
  • StoryWrapper(String storyFile);
    storyFile: name of file containing story
  • boolean makeTrace(String traceFile, String errorFile);
    boolean makeTrace(void);
    A trace of the story in storyFile is written to traceFile. The errors are logged to errorFile. In the second form, the defaults are:
    traceFile=storyFile+".trace";
    errorFile=storyFile+".error";

    Return Value:
    On successful generation of a trace, makeTrace returns true; otherwise, it returns false.
Class AnimationTrace

A utility class to write to a file. It is used to write the trace file without explicitly flushing the file buffer.

Public methods:
  • AnimationTrace(String fileName);
    fileName: name of trace file
  • void write(String data);
    The string data is written to the trace file and the stream to the file is flushed.
  • void close(void);
    Closes the underlying file stream.
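The flush-on-write behaviour described for AnimationTrace can be sketched as below. This is an assumption about how such a utility might look, not the project's source: the class name is hypothetical, and the constructor taking a generic Writer is added here only so that the sketch can be exercised without touching the file system.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

/** Illustrative sketch only: a writer that flushes after every write. */
class FlushingTraceSketch {
    private final Writer out;

    /** Wraps any Writer (hypothetical convenience for testing). */
    FlushingTraceSketch(Writer out) {
        this.out = out;
    }

    /** Opens the named trace file for writing. */
    FlushingTraceSketch(String fileName) throws IOException {
        this(new FileWriter(fileName));
    }

    /** Writes the data and flushes immediately, so callers never flush explicitly. */
    void write(String data) throws IOException {
        out.write(data);
        out.flush();
    }

    /** Closes the underlying stream. */
    void close() throws IOException {
        out.close();
    }

    public static void main(String[] args) throws IOException {
        StringWriter buffer = new StringWriter();
        FlushingTraceSketch trace = new FlushingTraceSketch(buffer);
        trace.write("path 0 0 3 5\n");
        trace.close();
        System.out.print(buffer);
    }
}
```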
Class Coord

Coordinate vectors for the XY plane are encapsulated in this wrapper class for float[2]. By overriding the Object.toString() method, printing a coordinate becomes easier.

Public methods:
  • Coord(float X, float Y);
    Coord(String roomName, String XYString);
    X,Y: components of the position vector
    roomName: the name of the room with respect to which the coordinates are specified.
    XYString: a comma-separated string with the X and Y coordinates of the point with respect to the room.
  • String toString(void);
    Return value: A comma-separated string consisting of the X and Y (absolute) coordinates of the point is returned.
  • float[] getvec(void);
    Return value: floating point vector representation of the coordinate in the absolute frame.
Related files:
./RoomBase.dat : base coordinates of each room. Format:
<room_name> <base_x> <base_y>'\n'
.
.
room_name does not contain any white-space.
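The conversion from room-relative to absolute coordinates can be sketched as follows. This is only an illustration under stated assumptions: room bases are read from text in the RoomBase.dat format above, the absolute position is taken to be the room base plus the room-relative offset, and all class and method names are hypothetical rather than the actual Coord implementation.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only: resolves room-relative coordinates against room base points. */
class RoomCoordSketch {

    /** Parses lines of "<room_name> <base_x> <base_y>" into a map from room name to base vector. */
    static Map<String, float[]> parseRoomBases(String text) {
        Map<String, float[]> bases = new HashMap<>();
        for (String line : text.split("\\R")) {
            String[] f = line.trim().split("\\s+");
            if (f.length != 3) {
                continue; // skip blank or malformed lines
            }
            bases.put(f[0], new float[] { Float.parseFloat(f[1]), Float.parseFloat(f[2]) });
        }
        return bases;
    }

    /** Converts a room name and a comma-separated "x,y" string into absolute coordinates. */
    static float[] toAbsolute(Map<String, float[]> bases, String roomName, String xyString) {
        float[] base = bases.get(roomName);
        if (base == null) {
            throw new IllegalArgumentException("unknown room: " + roomName);
        }
        String[] xy = xyString.split(",");
        return new float[] {
            base[0] + Float.parseFloat(xy[0].trim()),
            base[1] + Float.parseFloat(xy[1].trim())
        };
    }

    /** Formats a coordinate as a comma-separated string, in the spirit of toString(). */
    static String format(float[] v) {
        return v[0] + "," + v[1];
    }

    public static void main(String[] args) {
        Map<String, float[]> bases = parseRoomBases("Hall 10 20\nKitchen 0 5\n");
        System.out.println(format(toAbsolute(bases, "Hall", "3,10")));
    }
}
```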
Class Entity

The various entities that can appear in a story are represented by instances of this class. Public methods:

  • Entity(int type, int srNo);
    type: the type of object
    srNo: the serial number of the object which can be used to identify it
  • Coord vicinity(void);
    Return value: A Coord object corresponding to a point in the vicinity of the entity.
  • Coord onCoord(void);
    Return value: A Coord object corresponding to a point at which another entity is said to be "on" this.
  • String vertices(void)
    Return value: White-space-separated string consisting of the four coordinates representing the bounding rectangle of the entity in the obstacle space.
  • String toString(void)
    Return value: String.valueOf(srNo);
Class CPS

Camera Placement System.
Reads the file containing the trace of the animation and delegates control to the respective Idiom classes depending on the type of activity.

Public methods:
  • CPS();
    Reads file using StreamTokenizer and creates the corresponding Idiom object;
Public members:
  • StreamTokenizer tok; reads trace from file "activityList.txt"
  • FileWriter fout; writes camera positions to "cameraPositions"
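Reading the activity trace with StreamTokenizer might look roughly like the sketch below. It assumes the block layout of the sample camera-control input (an activity keyword, a run of numbers, then "end") and only records each activity name with the count of numbers it carried; the real CPS would instead construct the matching Idiom object. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only: scans "keyword numbers... end" blocks from a trace. */
class TraceReaderSketch {

    /** Returns one entry per activity block, formatted as "name:countOfNumbers". */
    static List<String> readActivities(Reader in) throws IOException {
        StreamTokenizer tok = new StreamTokenizer(in);
        tok.slashSlashComments(true); // skip "// ..." comments as used in the sample trace
        List<String> activities = new ArrayList<>();
        String current = null;
        int numbers = 0;
        while (tok.nextToken() != StreamTokenizer.TT_EOF) {
            if (tok.ttype == StreamTokenizer.TT_WORD) {
                if (tok.sval.equals("end")) {
                    if (current != null) {
                        activities.add(current + ":" + numbers);
                    }
                    current = null;
                } else if (current == null) {
                    current = tok.sval; // keyword opening a new block
                    numbers = 0;
                }
            } else if (tok.ttype == StreamTokenizer.TT_NUMBER && current != null) {
                numbers++;
            }
        }
        return activities;
    }

    public static void main(String[] args) throws IOException {
        String trace = "look\n10 3 12 4 //direction of gazing\n45 // no of frame\nend\n";
        System.out.println(readActivities(new StringReader(trace)));
    }
}
```

Because comments are discarded by the tokenizer, an "end" keyword must appear on its own line or before any comment on that line.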
Class PathIdiom implements ViewAngle

Has the heuristics related to the motion of a person hardcoded into it. Given the actual motion details, it fills in the remaining details and calls the write functions of the related camera fragment (motion).

Public methods:
  • PathIdiom(StreamTokenizer tok, FileWriter fout);
    StreamTokenizer tok; object linked to input file.
    FileWriter fout; object linked to output file.
    Generates the camera positions and lens type (field of view) for the activity of moving along a path. The path is assumed to be composed of line segments joined together.
Class GiveIdiom implements ViewAngle

Generates camera positions for the activity of giving.

Public Methods:
  • GiveIdiom(StreamTokenizer tok, FileWriter fout);
    The variables are the same as those of PathIdiom.
Class LookIdiom implements ViewAngle

Generates camera positions for a scene depicting a person looking in a particular direction.

Public Methods:
  • LookIdiom(StreamTokenizer tok, FileWriter fout);
    The variables are the same as those of PathIdiom.
Class PickIdiom implements ViewAngle

Generates camera positions for the pick action. The camera is placed at the apex to get a better view.

Public Methods:
  • PickIdiom(StreamTokenizer tok, FileWriter fout);
    The variables are the same as those of PathIdiom.
Class TalkIdiom implements ViewAngle

Generates camera positions for two persons talking.

Public Methods:
  • TalkIdiom(StreamTokenizer tok, FileWriter fout);
    The variables are the same as those of PathIdiom.
    This idiom handles two persons talking. It starts with an apex view and then shifts alternately between the two as they talk, shifting back to the apex view when nobody talks.


Path Planning
class obstacle
  • An object of this class stores the co-ordinates of a rectangular obstacle.

class cordinate
  • obstacle [] o_space(obstacle ob[], int n)
    Used for calculating co-ordinates of obstacle space.
    Input parameters: array of objects of type obstacle, number of obstacles.
    Output: array of objects of type obstacle containing the newly calculated co-ordinates of the obstacle space.
  • int[] range(obstacle ob)
    Input parameter: an object of type obstacle.
    Output: greatest integer of the obstacle co-ordinates. This was used by us to calculate the range of the environment which contains all the obstacles.

class equation
  • float[][] line(float p1[][], float p2[][])
    Input parameters: Two floating point arrays containing the co-ordinates of points p1 and p2.
    Output: A floating point array containing the coefficients of the line joining p1 and p2. The equation of the line is in the form Ax + By = C. A mistake we made while writing this method is that the co-ordinates of each point are taken in a two-dimensional array of size [1][2].
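The coefficients of the line through two points can be computed as in the sketch below. It is a hypothetical stand-alone version rather than the project's equation.line method: it takes each point as a plain array of size 2 instead of [1][2], and the choice A = y2 - y1, B = x1 - x2, C = A*x1 + B*y1 is one valid normalisation among many.

```java
/** Illustrative sketch only: coefficients of the line Ax + By = C through two points. */
class LineEquationSketch {

    /** Returns {A, B, C} such that A*x + B*y = C holds for both p1 and p2. */
    static float[] line(float[] p1, float[] p2) {
        float a = p2[1] - p1[1];
        float b = p1[0] - p2[0];
        float c = a * p1[0] + b * p1[1];
        return new float[] { a, b, c };
    }

    public static void main(String[] args) {
        float[] abc = line(new float[] { 0f, 0f }, new float[] { 2f, 4f });
        System.out.println(abc[0] + "x + " + abc[1] + "y = " + abc[2]);
    }
}
```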

class decomposition
  • Instance variable: points[][][], to store co-ordinates of all points in a cell of the grid after decomposing the configuration space
  • void calculatepoints(int i, int j, obstacle ob[], int n)
    Input parameters: Two integers i,j representing the cell (i,j), array of objects of type obstacle and an integer n representing number of obstacles. End effect: Will compute all the co-ordinates in the grid (i, j) after decomposing the configuration space.

class add
  • void addition(float start[][], float goal[][], decomposition dd[][])
    Input parameters: float start, goal containing the co-ordinates of the starting point and goal point; a two-dimensional array of objects of type decomposition containing the co-ordinates of all decomposed points in each cell. End effect: Start and goal co-ordinates are added to the cells.

Class Links
  • Instance variables: float x, y, connect[][]. These store the co-ordinates of the point (x,y) and the co-ordinates of all other points (connect[][]) that are visible from it.
  • void Vgraph(obstacle ob[], decomposition dd, int n)
    Input parameters: array of objects of type obstacle, object of type decomposition and number of obstacles. End effect: Update the array connect mentioned above.
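The visibility test behind such a graph can be sketched as follows. This is an illustration under assumptions, not the project's Vgraph method: obstacles are axis-aligned rectangles given as {minX, minY, maxX, maxY}, and two points count as mutually visible when the segment between them does not pass through the open interior of any rectangle, so that segments touching a corner or running along an edge remain allowed. The names are hypothetical.

```java
/** Illustrative sketch only: point-to-point visibility among axis-aligned rectangles. */
class VisibilitySketch {

    /** True if the segment (x1,y1)-(x2,y2) passes through the open interior of rect. */
    static boolean crossesInterior(double x1, double y1, double x2, double y2, double[] rect) {
        double[] p = { x1, y1 };
        double[] d = { x2 - x1, y2 - y1 };
        double[] lo = { rect[0], rect[1] };
        double[] hi = { rect[2], rect[3] };
        double t0 = 0.0, t1 = 1.0; // parameter interval of the segment still inside the slabs
        for (int a = 0; a < 2; a++) {
            if (d[a] == 0.0) {
                // Parallel to this slab: must lie strictly inside it to reach the interior.
                if (p[a] <= lo[a] || p[a] >= hi[a]) {
                    return false;
                }
            } else {
                double ta = (lo[a] - p[a]) / d[a];
                double tb = (hi[a] - p[a]) / d[a];
                t0 = Math.max(t0, Math.min(ta, tb));
                t1 = Math.min(t1, Math.max(ta, tb));
            }
        }
        return t0 < t1; // a non-empty open interval means the interior is crossed
    }

    /** True if no rectangle blocks the straight segment from p to q. */
    static boolean visible(double[] p, double[] q, double[][] rects) {
        for (double[] r : rects) {
            if (crossesInterior(p[0], p[1], q[0], q[1], r)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        double[][] rects = { { 0, 0, 2, 2 } };
        System.out.println(visible(new double[] { -1, 1 }, new double[] { 3, 1 }, rects));
        System.out.println(visible(new double[] { 0, 0 }, new double[] { 2, 0 }, rects));
    }
}
```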

class open
  • Instance variables: float node[][],parent[][],cost1,cost2; open next;
    Maintains a linked list (next) of all co-ordinates that are open while the A* search is performed. node stores the co-ordinates of the point under consideration, parent contains the co-ordinates of its parent, cost1 is the actual cost from the start, and cost2 is the heuristic cost from node to goal.

class close
  • Instance variables: Similar to those of class open.

class search
  • Instance variables: float start[][], goal[][];
    To store co-ordinates of start and goal.
  • close execution(Links link[][][])
    Input parameters: Three-dimensional array of type Links. Output: A linked list of objects of type close after performing the search operation.
  • close update_close(close first, open expand)
    Input parameters: A linked list of objects of type close (first), object of type open (expand).
    Output: The expanded node (expand) is added to the close linked list.
  • open update_open(open n1, open n2, Links link)
    Input parameters: Two objects of type open (n1 and n2) and an object of type Links.
    Output: A linked list of objects of type open after adding the node to be expanded n1 to the linked list n1. link should contain the co-ordinates of all points to which node n1 is visibly connected.
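The open/close bookkeeping of A* can be sketched compactly as below. This is not the project's search class: it assumes nodes are indexed points with an adjacency list (for instance from a visibility graph), uses straight-line distance both as edge cost and as heuristic, and replaces the hand-built linked lists with a priority queue and a boolean closed array. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

/** Illustrative sketch only: A* over indexed 2D points with Euclidean costs. */
class AStarSketch {

    static double dist(double[] a, double[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }

    /** Returns the node indices from start to goal, or an empty list if goal is unreachable. */
    static List<Integer> search(double[][] pts, int[][] neighbours, int start, int goal) {
        int n = pts.length;
        double[] g = new double[n];          // actual cost from start (cost1 in the text)
        Arrays.fill(g, Double.POSITIVE_INFINITY);
        int[] parent = new int[n];
        Arrays.fill(parent, -1);
        boolean[] closed = new boolean[n];
        // Open list ordered by f = g + h; each entry is {f, nodeIndex}.
        PriorityQueue<double[]> open = new PriorityQueue<>((a, b) -> Double.compare(a[0], b[0]));
        g[start] = 0.0;
        open.add(new double[] { dist(pts[start], pts[goal]), start });
        while (!open.isEmpty()) {
            int u = (int) open.poll()[1];
            if (closed[u]) {
                continue; // stale entry for an already expanded node
            }
            closed[u] = true;
            if (u == goal) {
                break;
            }
            for (int v : neighbours[u]) {
                double candidate = g[u] + dist(pts[u], pts[v]);
                if (!closed[v] && candidate < g[v]) {
                    g[v] = candidate;
                    parent[v] = u;
                    open.add(new double[] { candidate + dist(pts[v], pts[goal]), v });
                }
            }
        }
        List<Integer> path = new ArrayList<>();
        if (!closed[goal]) {
            return path;
        }
        for (int v = goal; v != -1; v = parent[v]) {
            path.add(v);
        }
        Collections.reverse(path);
        return path;
    }

    public static void main(String[] args) {
        double[][] pts = { { 0, 0 }, { 1, 0 }, { 1, 5 }, { 2, 0 } };
        int[][] neighbours = { { 1, 2 }, { 0, 3 }, { 0, 3 }, { 1, 2 } };
        System.out.println(search(pts, neighbours, 0, 3));
    }
}
```

Following the parent links back from the goal corresponds to what the unsmooth method of class path is described as doing with the close list.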

class path
  • Instance variables: float co_or[][]; path next;
    Stores a linked list of the co-ordinates of all points in the obtained path.
  • path unsmooth(close tail, float start[][], float goal[][])
    Input parameters: A linked list of type close, start and goal co-ordinates. Output: A linked list of type path containing the co-ordinates of all points in the path.

class snakes
  • Instance variables: float traj[][][]; int no_of_nodes;
  • float[][] smooth(obstacle ob[], int n, int i)
    Input parameters: Array of objects of type obstacle, integer n representing the number of obstacles, and integer i representing the ith control point whose new co-ordinates have to be calculated after smoothing.
    Output: New co-ordinates of ith point after smoothing.

class root
  • String generate_path(String name, float start[], float goal[])
    Input parameters: String name containing name of file which has co-ordinates of all obstacles. float start[], goal[] will contain co-ordinates of start and goal points respectively.
    Output: A string containing co-ordinates of the smoothened path between start and goal.
Animation

entity.cpp
#define null
#define doing
#define done
typedef char boolean;

struct CameraParams
Members:
float posx, posy, posz;
(x,y,z) coordinates of the camera

float lookx, looky, lookz;
(x,y,z) coordinates of the viewpoint

float fovy;
field of view of the camera
Class Entity

A virtual class for a renderable object defining virtual functions to be inherited by its derivatives.

Public Methods:
  • virtual int render();
    Renders the object to the screen.
    Return value: null if the object did not perform any action; doing if an action is being performed; done if an action terminated in this call.
  • virtual boolean Do(int action);
    Initialises the object to start performing a certain action.
    Return value: true on success, false otherwise.
Class Human:public Entity;
Class Chair:public Entity;
Class Table:public Entity;
Class Book:public Entity;
Class Camera:public Entity;

controller.cpp
Class Controller

From trace and camera placement files, it coordinates the rendering of the scene.

Public methods:
  • Controller(char* actFile, char* camFile);
    actFile: file with the trace of the story
    camFile: file with camera coordinates for frames
  • boolean nextAction(void);
    reads the next action from actFile and initialises the appropriate graphic Entity object to perform that action.

    Return value:
    On successful initialisation, returns true; otherwise, it returns false.
  • struct CameraParams* nextCamera(void);
    Return value: next camera coordinate in camFile.


run.cpp

void idle(void);
This is the method registered as the GLUT idle callback through the call glutIdleFunc(idle). It calls the render method of each of the Entity instances created by the controller.

Online Links

Human modelling at the University of Penn.:
http://www.cis.upenn.edu/~hms/home.html
The JACK project: http://www.cis.upenn.edu/~hms/jack.html
References to Virtual Human works: http://www.pasociety.org/perfanim
Ken Perlin's IMPROV: http://www.mrl.nyu.edu/perlin

Bibliography

@Article{Badler/Smoliar:1979,
  author=       { Badler, Norman I. and Smoliar, Stephen W.},
  year=         { 1979},
  institution=  { U. of Pennsylvania 2-->National U. of Singapore},
  title=        { Digital Representations of Human Movement},
  journal=      { ACM Computing Surveys},
  month=        { march},
  volume=       { 11},
  email=        { badler@central.cis.upenn.edu; Smoliar@ISS.nus.sg},
  annote=       {
The techniques of representing a human being as a computer generated 
graphic entity are discussed. A brief description of the Labanotation 
used to precisely quantify body postures is given. It is emphasised 
that although direct representation in a 2D space is possible, in the 
general case, the better approach is to construct a 3D model and then 
project it to a plane. Three aspects are discussed:
1 Representations of the human body
   this is done by the following methods:
    stick figures : the simplest and most unimpressive
    surface models: excellent results, but slightly unrealistic and 
                    a fair share of blemishes.
    volume models : the human body is decomposed into primitive solids
                    such as cylinders, spheres, and ellipsoids.
2 Representation of movement
  Given a model for a human body, the process of animating it involves 
  producing a succession of frames, each slightly different from the 
  previous. This is attained by key frames, where a set of important 
  frames is provided and the intermediate parts are interpolated; by 
  movement functions where Labanotation is used to choreograph the 
  motion; and simulation where the mechanics of the human body are also
  encapsulated in the model.

3 Finally, an architecture for such a system is discussed. 
-k.anandsudhakar feb/2k }

}
@Misc{Geib/Levison/Moore:1994,
  author=       { Geib, Christopher and Levison, Libby and
                  Moore, Michael B.},
  year=         { 1994},
  institution=  { upenn},
  title=        { SodaJack: An Architecture for Agents that Search for 
                  and Manipulate Objects},
  month=        { january},
  email=        { (geib,libby,mmoore)@linc.cis.upenn.edu},
  annote=       {
This paper deals with the problem of an agent whose aim is to search 
and undertake manipulation tasks in an environment. They have 
implemented the approach in a system called SODAJACK, which does the 
animation. The agent receives as input high level commands like 
"fetch the scoop" and the system has to figure out the exact low-level 
action to do the job. This involves a knowledge about the possible 
locations of the scoop, plan a route to those, explore them, and then 
finally the act of lifting it up. 
The system has been divided into a hierarchy of three planners that 
respond to the input goal, and give as output the action outline to 
achieve the goal. The task division is like this:
1.search planner, it converts the goals into a plan to search.
2.object specific planner, this relates to each search plan by the 
search planner a particular object and undertakes feasibility tests 
for the action plans generated.
3.hierarchical planner(ItPlans), this supervises the other two and 
delegates the control first to search to get a plan and then to the 
object to make it specific.  -Vikrant Kumar 11/02/2000 }

}
@Misc{Salesin/et.al:1994,
  author=       { Christianson, David B. and Anderson, Sean E. and
                  He, Li-wei and Salesin, David H. and 
                  Weld, Daniel S. and Cohen, Michael F.},
  year=         { 1994},
  institution=  { 4U. of Washington; Microsoft Research, Redmond
                  2-->Stanford},
  title=        { Declarative Camera Control for Automatic
                  Cinematography},
  email=        { (dbc1,lhe,salesin,weld)@cs.washington.edu;
                  seander@stanford.edu; mcohen@microsoft.com},
  annote=       {
For a long time, programmers have not made use of cinematographic principles in 
computer animations. The authors try to fill this gap by making the 
rules of cinematic storytelling lend themselves easily to programming. 
For this purpose they have formalized the rules into a  Declarative 
Camera Control Language(DCCL). Such a thing will be very useful as it 
will allow programs to present a dramatic point of view aesthetically.
The authors first introduce the language of the cinema like the breakup 
of a film into scenes and shots, shots being the smallest unit. Another 
thing is the placement of the camera, which depending on the scene can 
be apex, internal, external or parallel. Cinematographers have 
identified certain field of views of the shots which give pleasing 
results. And then there are certain constraints on a shot which should 
be satisfied like parallel editing and break movement. Next comes the 
concept of idioms, which is the way cinematographers describe 
situations in a film. DCCL is an attempt to formalize this idiom. 
The DCCL is composed of four basic components: fragments, views, 
placements and movement endpoints. A fragment is the time interval during 
which the camera performs a simple motion. A simple shot may comprise 
of one or more fragments.

Next they define the Camera Placement System(CPS). The CPS is a three 
stage pipeline consisting of 
1.the sequence planner
2.the  compiler
3.the heuristic evaluator.
The basic aim of the CPS is to give the camera positions depending on 
the input of the positions of the various interacting entities. And 
the authors have implemented this approach in a video game.
The authors succeed in highlighting the importance of using 
cinematographic techniques in computer animations so as to make the 
experience more enriching.             -Vikrant Kumar  11/02/2000 }

}
@Article{Lozano/Lozano-Perez:1996,
  author=       { Maron, Oded and Lozano-Perez, Tomas},
  year=         { 1996},
  institution=  { 2mit},
  title=        { Visible Decomposition: Real Time Path Planning in 
                  Large Planar Environments},
  journal=      { AI Memo},
  month=        { january},
  www=          { ftp://ftp.ai.mit.edu/pub/users/oded\
                  /papers/planning.ps.Z},
  email=        {oded@ai.mit.edu,tlp@ai.mit.edu},
annote= { This paper deals with the use of visibility graphs to do motion planning. -Rajesh Rajasekar 2/2000} }
@Article{Hoffman/Hopcroft:1987,
  author=       { Hoffman, Christoph M. and Hopcroft, John E.},
  year=         { 1987},
  institution=  { Purdue-cs; Cornell-cs},
  title=        { Simulation of Physical Systems from Geometric Models},
  journal=      { IEEE J. of Robotics and Automation},
  month=        { june},
  vol=          { RA-3},
  annote=       {
The mechanics of simulation are discussed. -k. anandsudhakar feb/2k}

}

Kesari Anandsudhakar, Vikrant Kumar, Rajesh Rajshekhar at IITK