The Virtual Director project at IIT Kanpur attempts to build a virtual environment and produce actions in it according to a natural language story. This project fits into the big picture beyond the stage where natural language has been parsed for grammar and semantic information, and the spatial data defuzzified. At this point, the input would be in the form of spatial descriptors and action directives which are at a level of complexity below natural language but higher than the most primitive of directives.


Past Work

Work has been done in this field at the Indian Institute of Technology, Kanpur by Dr. Amitabha Mukherjee as a part of the Virtual Director project.

The aspects of anthropometrics and modelling are discussed by Norman Badler and Stephen Smoliar [Badler/Smoliar:1979]. Physical aspects such as the dynamics of complex objects are studied by Hoffman and Hopcroft [Hoffman/Hopcroft:1987]. These results will be used to perform character animation.

Norman Badler has developed Jack, a basic human body animation system [Badler]. SodaJack, an extension of the basic Jack system by Levison [Geib/Libby/Moore:1994], simulates a soda bar operator. That work discusses action planning in the context of search and manipulation of objects.

Salesin et al. have created a precise language for the otherwise fuzzy process of defining shots and other basic elements of cinematography [Salesin/Cohen/Christianson:96]. Camera control to amplify the impact of the story can be based on this convention.

Lozano and Lozano-Perez explain the method of using visibility graphs to perform path planning in a domain with polygonal obstacles [Lozano/Lozano-Perez:1996].

Sample Input-Output

Sample outputs from the Path Planning Module:

The results of motion planning are shown in the following images. In the actual project the input is taken from a file which has the co-ordinates of all the obstacles. The output is another file which contains the co-ordinates of the path. The images shown below were plotted for better understanding.

The above image shows the obstacles in the environment. There are six obstacles in total.

This image shows the obstacle space calculated from the obstacles. Strictly, the corners of each obstacle space are rounded, but we have approximated them as sharp corners. This approximation simplifies the calculation of the visibility graph.
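
The sharp-cornered approximation can be sketched in Java as follows. This is only an illustration under our own assumptions, not the project's o_space method: the class ObstacleSpaceSketch, the nested Rect type, and the idea of growing every obstacle by a single radius (a disc-shaped character) are hypothetical and were chosen to keep the example small.

```java
final class ObstacleSpaceSketch {
    /** Axis-aligned rectangle given by its lower-left and upper-right corners. */
    static final class Rect {
        final float minX, minY, maxX, maxY;

        Rect(float minX, float minY, float maxX, float maxY) {
            this.minX = minX;
            this.minY = minY;
            this.maxX = maxX;
            this.maxY = maxY;
        }

        /**
         * Grows the rectangle by radius on every side. The exact obstacle
         * space of a disc has rounded corners; keeping the corners sharp
         * slightly over-estimates it but leaves a plain rectangle.
         */
        Rect grow(float radius) {
            return new Rect(minX - radius, minY - radius, maxX + radius, maxY + radius);
        }
    }

    /** Hypothetical stand-in for o_space: grows every obstacle by the same radius. */
    static Rect[] obstacleSpace(Rect[] obstacles, float radius) {
        Rect[] grown = new Rect[obstacles.length];
        for (int i = 0; i < obstacles.length; i++) {
            grown[i] = obstacles[i].grow(radius);
        }
        return grown;
    }

    public static void main(String[] args) {
        Rect[] grown = obstacleSpace(new Rect[] { new Rect(2f, 2f, 4f, 5f) }, 0.5f);
        System.out.println(grown[0].minX + " " + grown[0].minY + " "
                + grown[0].maxX + " " + grown[0].maxY);
    }
}
```

Because the grown obstacles are still rectangles, their four corners can be used directly as candidate nodes of the visibility graph.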

This image shows the path (in dark green) between start and goal. The start is at (0,0) and the goal is at (28.5,45). The path is shown in the actual environment.

In this image the same path is shown in configuration space.

Here motion planning is done in the same obstacle space, but the goal is changed to (16.5,25). In this image, the path obtained is not smoothed.

This image shows the effect of smoothing on the path obtained above. The snake is applied iteratively five times to the path and the result is shown above. We believe that better smoothing can be obtained by varying the coefficients of the external and internal energy functions and the number of times the snake is applied to the path.
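
The iterative application of the snake can be illustrated with a heavily simplified Java sketch. It is not the project's snakes class: only an elastic internal term is modelled (each interior point is pulled towards the midpoint of its neighbours), the external energy that keeps the path away from obstacles is left out, and the names SnakeSmoothingSketch, smoothOnce, smooth and alpha are our own assumptions.

```java
final class SnakeSmoothingSketch {
    /**
     * One smoothing pass. Every interior point moves towards the midpoint
     * of its two neighbours by the factor alpha (0 = no change,
     * 1 = move all the way). Start and goal stay fixed.
     */
    static float[][] smoothOnce(float[][] path, float alpha) {
        float[][] out = new float[path.length][];
        for (int i = 0; i < path.length; i++) {
            out[i] = path[i].clone();
        }
        for (int i = 1; i < path.length - 1; i++) {
            for (int k = 0; k < 2; k++) {
                float mid = 0.5f * (path[i - 1][k] + path[i + 1][k]);
                out[i][k] = path[i][k] + alpha * (mid - path[i][k]);
            }
        }
        return out;
    }

    /** Applies smoothOnce the given number of times. */
    static float[][] smooth(float[][] path, float alpha, int iterations) {
        float[][] current = path;
        for (int it = 0; it < iterations; it++) {
            current = smoothOnce(current, alpha);
        }
        return current;
    }

    public static void main(String[] args) {
        float[][] path = { { 0, 0 }, { 3, 5 }, { 7, -1 }, { 10, 3 }, { 15, 6 } };
        for (float[] p : smooth(path, 0.5f, 5)) {
            System.out.println(p[0] + " " + p[1]);
        }
    }
}
```

In this sketch alpha plays the role of the internal energy coefficient and iterations the number of times the snake is applied; a fuller version would add an external term and reject moves that enter the obstacle space.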

The visible decomposition technique was tested in quite a few obstacle spaces and gave successful results. Only the energy function of the snakes needs to be fine-tuned to obtain smoother paths.

The input to the program will be a description of the scene followed by a transcript of the interactions between the various components of the scene.

Man Jack
Book HoggCraig
Table t
Jack at Hall 3 10
t at Hall 7 4
HoggCraig on t
Jack look HoggCraig
Jack pick HoggCraig
Jack read HoggCraig
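
To make the shape of this input concrete, the following Java sketch classifies each line of the sample story. The grammar is only inferred from the sample above (two-word declarations, "at" placements with a room and two coordinates, "on" relations, and subject-verb-object actions); the class StoryLineSketch and its output strings are hypothetical and do not describe the project's actual parser.

```java
final class StoryLineSketch {
    /** Classifies one story line by its shape (categories are assumptions). */
    static String classify(String line) {
        String[] t = line.trim().split("\\s+");
        if (t.length == 2) {
            return "declare " + t[1] + " as " + t[0];
        }
        if (t.length == 5 && t[1].equals("at")) {
            return "place " + t[0] + " in " + t[2] + " at (" + t[3] + ", " + t[4] + ")";
        }
        if (t.length == 3 && t[1].equals("on")) {
            return "put " + t[0] + " on " + t[2];
        }
        if (t.length == 3) {
            return "action " + t[1] + ": " + t[0] + " -> " + t[2];
        }
        return "unrecognised: " + line;
    }

    public static void main(String[] args) {
        String[] story = {
            "Man Jack", "Book HoggCraig", "Table t",
            "Jack at Hall 3 10", "t at Hall 7 4", "HoggCraig on t",
            "Jack look HoggCraig", "Jack pick HoggCraig", "Jack read HoggCraig"
        };
        for (String line : story) {
            System.out.println(classify(line));
        }
    }
}
```
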
Input to Camera Control Module:
10   3  12  4   //direction of gazing
45              // no of frame

path     //the pick is interpolated into a goto and a pick.
1.5      //velocity
0  0     //the following nos. are path pts.
3  5
7  -1
10  3
15  6

13  7        //position of the person
 13  10      //position of the object
20           //no of frames

15  3  15  7   //position of the reader and its direction of sitting
30 //no. of frames end
Sample output of the Camera Control Module:
static      //look
12.5  15.0
12.5  15.0  12.5  12.0
goby          //path
1.7149858 -1.0289916
1.7149858 -1.0289916 0.0 0.0
0.0 0.0 1.7149857 2.8583095
3.4299717 1.8293179 5.366164 5.056305
3.4299717 1.8293179 1.7149857 2.8583095
1.7149857 2.8583095 3.6511781 6.0852966
1.047 5.366164
3.6511781 6.0852966
2.2979367 6.053095
5.366164 5.056305 35.310883 -41.922836
5.366164 5.056305 2.2979367 6.053095
2.2979367 6.053095 33.595898 -40.893845
// truncated for conciseness ...
static      //pick
12.5  15.0
12.5  15.0  12.5  12.0
static      //read
13.0  12.0
13.0  12.0  10.0  12.0
// truncated for conciseness ...

Library Documentation

Class StoryWrapper

Encapsulates the story in an object. The processing results in a trace of the story which is a representation with no inter-object references. Paths are resolved into walkable segments.

Public methods:
  • StoryWrapper(String storyFile);
    storyFile: name of file containing story
  • boolean makeTrace(String traceFile, String errorFile);
    boolean makeTrace(void);
    A trace of the story in storyFile is written to traceFile. The errors are logged to errorFile. In the second form, the defaults are:

    Return Value:
    On successful generation of a trace, makeTrace returns true; otherwise it returns false.
Class AnimationTrace

A utility class to write to a file. It is used to write the trace file without explicitly flushing the file buffer.

Public methods:
  • AnimationTrace(String fileName);
    fileName: name of trace file
  • void write(String data);
    The string data is written to the trace file and the stream to the file is flushed.
  • void close(void);
    Closes the underlying file stream.
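
A write-and-flush helper of this kind can be sketched in a few lines of Java. The class below, TraceWriterSketch, is a hypothetical stand-in and not the project's AnimationTrace; in particular, letting IOException propagate to the caller is our own choice, since the original error handling is not documented here.

```java
import java.io.FileWriter;
import java.io.IOException;

final class TraceWriterSketch {
    private final FileWriter out;

    TraceWriterSketch(String fileName) throws IOException {
        out = new FileWriter(fileName);
    }

    /** Writes the data and flushes at once, so the caller never has to. */
    void write(String data) throws IOException {
        out.write(data);
        out.flush();
    }

    /** Closes the underlying file stream. */
    void close() throws IOException {
        out.close();
    }

    public static void main(String[] args) throws IOException {
        TraceWriterSketch trace = new TraceWriterSketch("trace-sketch.txt");
        trace.write("path 1.5 0 0 3 5\n");
        trace.close();
    }
}
```

Flushing on every write costs some speed, but the trace on disk is complete up to the last call even if the program stops before close is reached.
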
Class Coord

Coordinate vectors for the XY plane are encapsulated in this wrapper class for float[2]. By overriding the Object.toString() method, printing a coordinate becomes easier.

Public methods:
  • Coord(float X, float Y);
    Coord(String roomName, String XYString);
    X,Y: components of the position vector
    roomName: the name of the room with respect to which the coordinates are specified.
    XYString: a comma-separated string with the X and Y coordinates of the point with respect to the room.
  • String toString(void);
    Return value: A comma-separated string consisting of the X and Y (absolute) coordinates of the point is returned.
  • float[] getvec(void);
    Return value: floating point vector representation of the coordinate in the absolute frame.
Related files:
./RoomBase.dat : base coordinates of each room. Format:
<room_name> <base_x> <base_y>'\n'
room_name does not contain any white-space.
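
The conversion from room-relative to absolute coordinates can be sketched as below. The sketch reads room bases in the RoomBase.dat line format from a string rather than from the file, and the class RoomCoordSketch, its method names and the base values for Hall and Kitchen are all made up for illustration; they are not taken from the project.

```java
import java.util.HashMap;
import java.util.Map;

final class RoomCoordSketch {
    /** Parses lines of the form "<room_name> <base_x> <base_y>". */
    static Map<String, float[]> parseRoomBases(String contents) {
        Map<String, float[]> bases = new HashMap<>();
        for (String line : contents.split("\n")) {
            String[] t = line.trim().split("\\s+");
            if (t.length != 3) {
                continue; // skip blank or malformed lines
            }
            bases.put(t[0], new float[] { Float.parseFloat(t[1]), Float.parseFloat(t[2]) });
        }
        return bases;
    }

    /** Converts a room-relative "x,y" string into absolute coordinates. */
    static float[] toAbsolute(Map<String, float[]> bases, String roomName, String xyString) {
        float[] base = bases.get(roomName);
        if (base == null) {
            throw new IllegalArgumentException("unknown room: " + roomName);
        }
        String[] xy = xyString.split(",");
        return new float[] {
            base[0] + Float.parseFloat(xy[0].trim()),
            base[1] + Float.parseFloat(xy[1].trim())
        };
    }

    public static void main(String[] args) {
        // Made-up base coordinates, in the RoomBase.dat line format.
        Map<String, float[]> bases = parseRoomBases("Hall 10 5\nKitchen 0 20\n");
        float[] p = toAbsolute(bases, "Hall", "3,10");
        System.out.println(p[0] + "," + p[1]); // prints 13.0,15.0
    }
}
```
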
Class Entity

The various entities that can appear in a story are represented by instances of this class. Public methods:

  • Entity(int type, int srNo);
    type: the type of object
    srNo: the serial number of the object which can be used to identify it
  • Coord vicinity(void);
    Return value: A Coord object corresponding to a point in the vicinity of the entity.
  • Coord onCoord(void);
    Return value: A Coord object corresponding to a point at which another entity is said to be "on" this.
  • String vertices(void)
    Return value: White-space-separated string consisting of the four coordinates representing the bounding rectangle of the entity in the obstacle space.
  • String toString(void)
    Return value: String.valueOf(srNo);
Class CPS

Camera Placement System.
Reads the file containing the trace of the animation. Delegates control to the respective Idiom classes depending on the type of activity.

Public methods:
  • CPS();
    Reads file using StreamTokenizer and creates the corresponding Idiom object;
Public members:
  • StreamTokenizer tok; reads trace from file "activityList.txt"
  • FileWriter fout; writes camera positions to "cameraPositions"
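
The read-and-dispatch loop can be sketched with java.io.StreamTokenizer as follows. The trace text, the keywords look and path as dispatch keys, and the class TraceDispatchSketch are assumptions for illustration; the sketch only collects the activity keywords, where the real CPS would construct the matching Idiom object and let it consume the numbers that follow.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

final class TraceDispatchSketch {
    /** Returns the activity keywords found in the trace, in order. */
    static List<String> activities(Reader in) throws IOException {
        StreamTokenizer tok = new StreamTokenizer(in);
        tok.slashSlashComments(true); // ignore "// ..." annotations
        List<String> found = new ArrayList<>();
        while (tok.nextToken() != StreamTokenizer.TT_EOF) {
            if (tok.ttype == StreamTokenizer.TT_WORD) {
                // A real dispatcher would create the matching Idiom here.
                found.add(tok.sval);
            }
            // TT_NUMBER tokens (tok.nval) would be read by that Idiom.
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        String trace = "look 10 3 12 4 45 // direction of gazing, frames\n"
                + "path 1.5 0 0 3 5 7 -1\n";
        System.out.println(activities(new StringReader(trace)));
    }
}
```
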
Class PathIdiom implements ViewAngle

Has the heuristics related to the motion of a person hardcoded into it. Given the actual motion details, it fills in the remaining details and calls the write functions of the related camera fragment (motion).

Public methods:
  • PathIdiom(StreamTokenizer tok, FileWriter fout);
    StreamTokenizer tok; object linked to input file.
    FileWriter fout; object linked to output file.
    Generates the camera positions and lens type (field of view) for the activity of moving along a path. The path is assumed to be composed of line segments joined together.
Class GiveIdiom implements ViewAngle

Generates camera positions for the activity of talking.

Public Methods:
  • GiveIdiom(StreamTokenizer tok, FileWriter fout);
    The variables same as that of PathIdiom.
Class LookIdiom implements ViewAngle

Generates camera positions for a scene depicting a person looking in a particular direction.

Public Methods:
  • LookIdiom(StreamTokenizer tok, FileWriter fout);
    The variables same as that of PathIdiom.
Class PickIdiom implements ViewAngle

Generates camera positions for the pick action. The camera is placed at the apex position to get a better view.

Public Methods:
  • PickIdiom(StreamTokenizer tok, FileWriter fout);
    The variables same as that of PathIdiom.
Class TalkIdiom implements ViewAngle

Generates camera positions for two persons talking.

Public Methods:
  • TalkIdiom(StreamTokenizer tok, FileWriter fout);
    The variable description is the same as that of PathIdiom.
    This method handles two persons talking. It starts with an apex view and then shifts alternately between the two as they talk, shifting back to the apex view when nobody talks.

Path Planning
class obstacle
  • An object of this class stores the co-ordinates of a rectangular obstacle.

class cordinate
  • obstacle [] o_space(obstacle ob[], int n)
    Used for calculating co-ordinates of obstacle space.
    Input parameters: array of objects of type obstacle, number of obstacles.
    Output: array of objects of type obstacle containing the newly calculated co-ordinates of the obstacle space.
  • int[] range(obstacle ob)
    Input parameter: an object of type obstacle.
    Output: greatest integer of the obstacle co-ordinates. This was used by us to calculate the range of the environment which contains all the obstacles.

class equation
  • float[][] line(float p1[][], float p2[][])
    Input parameters: Two floating-point arrays containing the co-ordinates of points p1 and p2.
    Output: A floating-point array containing the coefficients of the line joining p1 and p2. The equation of the line is in the form Ax + By = C. A mistake committed by us while writing this method is that the co-ordinates of each point are taken in a two-dimensional array of size [1][2].
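
For reference, the coefficients of Ax + By = C can be computed as in the Java sketch below. It is an illustration rather than the project's line method: it takes each point as a flat float[2] instead of a [1][2] array, and the class name LineSketch and the choice A = y2 - y1, B = x1 - x2 (one of many equivalent scalings) are our own.

```java
final class LineSketch {
    /** Returns {A, B, C} such that A*x + B*y = C for the line through p1 and p2. */
    static float[] line(float[] p1, float[] p2) {
        float a = p2[1] - p1[1];          // A = y2 - y1
        float b = p1[0] - p2[0];          // B = x1 - x2
        float c = a * p1[0] + b * p1[1];  // C chosen so that p1 satisfies the equation
        return new float[] { a, b, c };
    }

    public static void main(String[] args) {
        float[] abc = line(new float[] { 1f, 1f }, new float[] { 3f, 2f });
        System.out.println(abc[0] + "x + " + abc[1] + "y = " + abc[2]);
    }
}
```

Substituting p2 gives A*x2 + B*y2 = x1*y2 - x2*y1, which equals C, so both points lie on the returned line.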

class decomposition
  • Instance variable: points[][][], to store co-ordinates of all points in a cell of the grid after decomposing the configuration space
  • void calculatepoints(int i, int j, obstacle ob[], int n)
    Input parameters: Two integers i,j representing the cell (i,j), array of objects of type obstacle and an integer n representing number of obstacles. End effect: Will compute all the co-ordinates in the grid (i, j) after decomposing the configuration space.

class add
  • void addition(float start[][], float goal[][], decomposition dd[][])
    Input parameters: float start, goal containing the co-ordinates of the starting point and the goal point; a two-dimensional array of objects of type decomposition which contains the co-ordinates of all decomposed points in each cell.
    End effect: Start and goal co-ordinates are added to the cells.

Class Links
  • Instance variables: float x, y, connect[][]. These store the co-ordinates of the point (x,y) and the co-ordinates of all other points (connect[][]) that are visible from it.
  • void Vgraph(obstacle ob[], decomposition dd, int n)
    Input parameters: array of objects of type obstacle, object of type decomposition and number of obstacles. End effect: Update the array connect mentioned above.
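
The visibility test behind such a graph can be sketched as follows: two points see each other if the segment joining them does not pass through the interior of any obstacle-space rectangle. The sketch uses Liang-Barsky style clipping, which is our choice and not necessarily what Vgraph does; the class VisibilitySketch and the {minX, minY, maxX, maxY} rectangle layout are assumptions.

```java
final class VisibilitySketch {
    /**
     * True if segment p-q passes through the interior of rectangle
     * r = {minX, minY, maxX, maxY}. Touching a corner or sliding along
     * an edge does not count as crossing.
     */
    static boolean crosses(float[] p, float[] q, float[] r) {
        float t0 = 0f, t1 = 1f;
        float dx = q[0] - p[0], dy = q[1] - p[1];
        float[] d = { -dx, dx, -dy, dy };
        float[] w = { p[0] - r[0], r[2] - p[0], p[1] - r[1], r[3] - p[1] };
        for (int i = 0; i < 4; i++) {
            if (d[i] == 0f) {
                if (w[i] <= 0f) {
                    return false; // parallel to this side and on or outside it
                }
            } else {
                float t = w[i] / d[i];
                if (d[i] < 0f) {
                    t0 = Math.max(t0, t); // entering the slab
                } else {
                    t1 = Math.min(t1, t); // leaving the slab
                }
            }
        }
        return t0 < t1; // a stretch of positive length lies inside
    }

    /** Two points are mutually visible if no rectangle blocks the segment. */
    static boolean visible(float[] p, float[] q, float[][] rects) {
        for (float[] r : rects) {
            if (crosses(p, q, r)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        float[][] rects = { { 1f, 1f, 3f, 3f } };
        System.out.println(visible(new float[] { 0f, 2f }, new float[] { 4f, 2f }, rects));
        System.out.println(visible(new float[] { 0f, 0f }, new float[] { 4f, 0f }, rects));
    }
}
```

Treating corner contact and edge-sliding as visible matters here, because shortest paths in a visibility graph run along obstacle corners and edges.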

class open
  • Instance variables: float node[][],parent[][],cost1,cost2; open next;
    Maintains a linked list (next) of all co-ordinates which are in open mode while the search is performed by the A* algorithm. node stores the co-ordinates of the point which is under consideration, parent contains the co-ordinates of its parent, cost1 is the actual cost from start and cost2 is the heuristic cost from node to goal.

class close
  • Instance variables: Similar to those of class open.

class search
  • Instance variables: float start[][], goal[][];
    To store co-ordinates of start and goal.
  • close execution(Links link[][][])
    Input parameters: Three-dimensional array of type Links.
    Output: A linked list of objects of type close after performing the search operation.
  • close update_close(close first, open expand)
    Input parameters: A linked list of objects of type close (first), object of type open (expand).
    Output: The expanded node (expand) is added to the close linked list.
  • open update_open(open n1, open n2, Links link)
    Input parameters: Two objects of type open (n1 and n2) and an object of type Links.
    Output: A linked list of objects of type open after adding the node to be expanded n1 to the link list n1. link should contain the co-ordinates of all points to which node n1 is visibly connected.
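
The open/close bookkeeping of A* can be sketched compactly in Java. This is a generic textbook formulation and not the project's search class: it uses a priority queue and arrays instead of hand-built linked lists, node indices instead of co-ordinates, and straight-line distance both as edge cost and as heuristic. All names in it are our own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

final class AStarSketch {
    static double dist(double[] a, double[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }

    /**
     * pts[i] is the position of node i; adj[i] lists the nodes visible from i.
     * Returns the node indices of a shortest path, or an empty list if none.
     */
    static List<Integer> search(double[][] pts, int[][] adj, int start, int goal) {
        int n = pts.length;
        double[] g = new double[n];        // actual cost from start (cost1 in the text)
        int[] parent = new int[n];
        boolean[] closed = new boolean[n]; // the close list
        Arrays.fill(g, Double.POSITIVE_INFINITY);
        Arrays.fill(parent, -1);
        g[start] = 0.0;
        // The open list, ordered by f = g + heuristic; entries are {f, node}.
        PriorityQueue<double[]> open =
                new PriorityQueue<>((a, b) -> Double.compare(a[0], b[0]));
        open.add(new double[] { dist(pts[start], pts[goal]), start });
        while (!open.isEmpty()) {
            int u = (int) open.poll()[1];
            if (closed[u]) {
                continue; // stale entry for an already expanded node
            }
            closed[u] = true;
            if (u == goal) {
                break;
            }
            for (int v : adj[u]) {
                double cand = g[u] + dist(pts[u], pts[v]);
                if (!closed[v] && cand < g[v]) {
                    g[v] = cand;
                    parent[v] = u;
                    open.add(new double[] { cand + dist(pts[v], pts[goal]), v });
                }
            }
        }
        List<Integer> path = new ArrayList<>();
        if (!closed[goal]) {
            return path; // goal was never reached
        }
        for (int v = goal; v != -1; v = parent[v]) {
            path.add(v);
        }
        Collections.reverse(path);
        return path;
    }

    public static void main(String[] args) {
        double[][] pts = { { 0, 0 }, { 1, 2 }, { 1, -0.5 }, { 2, 0 } };
        int[][] adj = { { 1, 2 }, { 0, 3 }, { 0, 2 }, { 1, 2 } };
        System.out.println(search(pts, adj, 0, 3));
    }
}
```

Following the parent links back from the goal, as the last loop does, corresponds to what the unsmooth method of class path is described as doing with the close list.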

class path
  • Instance variables: float co_or[][]; path next;
    To store a linked list of the co-ordinates of all points in the obtained path.
  • path unsmooth(close tail, float start[][], float goal[][])
    Input parameters: A linked list of type close, start and goal co-ordinates.
    Output: A linked list of type path containing the co-ordinates of all points in the path.

class snakes
  • Instance variables: float traj[][][]; int no_of_nodes;
  • float[][] smooth(obstacle ob[], int n, int i)
    Input parameters: Array of objects of type obstacle, integer n representing the number of obstacles and integer i representing the ith control point, whose new co-ordinates are to be calculated after smoothing.
    Output: New co-ordinates of ith point after smoothing.

class root
  • String generate_path(String name, float start[], float goal[])
    Input parameters: String name containing name of file which has co-ordinates of all obstacles. float start[], goal[] will contain co-ordinates of start and goal points respectively.
    Output: A string containing co-ordinates of the smoothened path between start and goal.

#define null
#define doing
#define done
typedef char boolean;

struct CameraParams
float posx, posy, posz;
(x,y,z) coordinates of the camera

float lookx, looky, lookz;
(x,y,z) coordinates of the viewpoint

float fovy;
field of view of the camera
Class Entity

A virtual class for a renderable object defining virtual functions to be inherited by its derivatives.

Public Methods:
  • virtual int render();
    Renders the object to the screen. Return value: null if the object did not perform any action, doing if an action is being performed, done if an action terminated in this call.
  • virtual boolean Do(int action);
    Initialises the object to start performing a certain action. Return value: true on success, false otherwise.
Class Human:public Entity;
Class Chair:public Entity;
Class Table:public Entity;
Class Book:public Entity;
Class Camera:public Entity;

Class Controller

From trace and camera placement files, it coordinates the rendering of the scene.

Public methods:
  • Controller(char* actFile, char* camFile);
    actFile: file with the trace of the story
    camFile: file with camera coordinates for frames
  • boolean nextAction(void);
    Reads the next action from actFile and initialises the appropriate graphic Entity object to perform that action.

    Return value:
    On successful initialisation, returns true; otherwise it returns false.
  • struct CameraParams* nextCamera(void);
    Return value: next camera coordinate in camFile.


void idle(void);
This is the method registered as the GLUT idle callback through the call glutIdleFunc(idle). It calls the render method of each of the Entity instances created by the Controller.

Online Links

Human modelling at the University of Penn.:
The JACK project:
References to Virtual Human works:
Ken Perlin's IMPROV:


