Artificial Intelligence ME 768 Jan-Apr 2000

 
OPTICAL FLOW BASED NAVIGATION USING 
AN ONBOARD CAMERA

Himanshu Arora, Abhishek Tiwari and Tirthankar Bandhyopadhyay

 IIT Kanpur : April 2000


Contents

  • Documentation


  • Motivation

AUTONOMOUS NAVIGATION of robots in real-life complex domains is an interesting and constructive task to accomplish. To do this, the robot should be able to perceive the relative depths of the various objects, dynamic or stationary, in its field of view. The mechanisms by which most living beings, including humans, do this are OPTICAL FLOW and STEREO VISION. This was the major motivating factor that led us to take up this project and implement the technique on a physical robot. However, we will be working with MONOCULAR VISION; an interesting extension of the project would be to make navigation more robust by employing STEREO VISION in addition to the techniques described here.

The project could be of extensive use in AUTONOMOUS SURVEILLANCE ROBOTS, MARS EXPLORERS, UNMANNED VEHICLES (UMVs), PILOTLESS AIRCRAFT, etc. What we wish to finally come up with is the implementation of AUTONOMOUS NAVIGATION on the Remotely Operated Mobile Platform.


Picture 1 (URL: robotics.jpl.nasa.gov/tasks/scirover/homepage.html)
Imagine a robot that can wander around the campus. Once its target position is specified, it can efficiently reach it, avoiding all kinds of obstacles, fixed or moving, that it encounters on its path. This requires distinguishing between global and non-global motion using optical flow techniques.


    Example:

Sample Task: To obtain the optical flow field of a fixed obstacle towards which the robot is heading directly, to analyze this optical flow, and to make appropriate decisions about the robot's direction of movement in order to avoid the obstacles in its path.

       The following two figures provide a very simple insight into the concept of Optical Flow. These figures exemplify how Optical Flow can be used for detecting obstacles during autonomous navigation. 
    Figure 1


    Figure 2 


    Past Work

The techniques described here for optical flow field calculation are taken from a paper by J.L. Barron and S.S. Beauchemin of the Dept. of Computer Science, University of Western Ontario, and D.J. Fleet of the Dept. of Computing Science, Queen's University. The paper describes nine different optical flow techniques (see the bibtex entry in the Annotated Bibliography below). The other part of our project is the interpretation of the optical flow fields. The idea for this comes from the paper "Real-Time Navigation and Obstacle Avoidance from Optical Flow on a Space Variant Map" by Gregory Baratoff, Christian Toepfer and Heiko Neumann.

        Some help was also taken from "Real Time Optical Flow", the Ph.D. thesis of Ted Camus. 


    Methodology

In this project we used a Lego robot (see Picture 3) which gets visual feedback from the onboard camera installed on it, and moves forward avoiding obstacles that come in its way.

       We have divided our project into three modules:

    1.  Generating the Optical Flow fields from the motion of the robot.
    2.  Meaningful interpretation of the fields and deciding upon the next step.
    3.  Implementing the commands given by the master processor on a physical robot.


    The following figure shows the block diagram of the process.
     

    OPTICAL FLOW FIELD GENERATION

As the robot moves forward, the objects around it move towards it with a certain velocity in the 3-D frame. The 2-D approximation of this motion appears as a change in the coordinates of the pixels corresponding to the same object between two consecutive image frames. This motion of the objects in the 2-D image is called the OPTICAL FLOW FIELD.

    There are two basic techniques used to determine OPTICAL FLOW :
     

    1.  Differential Techniques and
    2.  Region Based Matching.


In the differential techniques we estimate the velocity of the pixels by computing spatiotemporal derivatives of the image intensity. Here we have to make certain assumptions, such as the brightness of a given point in the image remaining constant with time. But often the space and time derivatives cannot be calculated very accurately.
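For reference, the brightness-constancy assumption and the constraint equation it yields can be written out as follows (this equation is standard in the optical flow literature and is implied, though not written, in this report):

    % Brightness constancy: a point keeps its intensity as it moves,
    %   I(x + u\,\delta t,\; y + v\,\delta t,\; t + \delta t) = I(x, y, t).
    % A first-order Taylor expansion gives the optical flow constraint
    % equation relating the flow (u, v) to the derivatives I_x, I_y, I_t:
    \[
        I_x\,u + I_y\,v + I_t = 0
    \]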

In Region Based Matching we take two consecutive frames and find the positions of pixels in the two images which correspond to the same object. Let the images at the two instants be F(x,y) and G(x,y), where F(x,y) and G(x,y) contain the intensity values of pixel (x,y). To find the best match for the pixel at position (x,y) in the first image, we have to maximize a similarity measure such as the correlation function, or minimize a distance function such as the sum-of-squared-differences (SSD), given by:
     
     

SSD_FG(x, y; p, q) = sum-over-i [ sum-over-j { W(i,j) * [F(x+i, y+j) - G(x+i+p, y+j+q)]^2 } ]

where W(i,j) is a window function, and we have to minimize over p and q. We have used Region Based Matching to calculate the optical flow, minimizing the SSD function. While finding the optical flow we use two assumptions:

  • The first assumption is that all pixels within a small blob move together, so one flow vector is computed per blob. Here the blob is 3 x 3 big.

  • The second assumption is that the robot is not moving very fast, so that the search for the corresponding pixel is limited to a small area. (The figure depicts the search space for pixel A; the corresponding point found is B.)

The differences between the x & y positions of the corresponding points in the two images are stored in an array, and this information is passed to the algorithm that finds the time to contact. A sketch of the matching step is given below.
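For illustration, a minimal C sketch of this block-matching search. The image size, the 3 x 3 blob from the first assumption, a unit window function, and the search radius are our choices here; the implementation section below uses an 8 x 6 block over a 9 x 9 neighbourhood.

    #include <limits.h>

    #define IMG_W  320   /* image width, as used later in the report      */
    #define IMG_H  240   /* image height                                  */
    #define BLOCK    3   /* blob size: 3 x 3, per the first assumption    */
    #define RADIUS   4   /* small search area, per the second assumption  */

    /* SSD between the BLOCK x BLOCK blob at (x, y) in f and the blob
       displaced by (p, q) in g; the window function W(i,j) is taken as 1. */
    static long ssd(const unsigned char f[IMG_H][IMG_W],
                    const unsigned char g[IMG_H][IMG_W],
                    int x, int y, int p, int q)
    {
        long s = 0;
        for (int j = 0; j < BLOCK; j++)
            for (int i = 0; i < BLOCK; i++) {
                int d = f[y + j][x + i] - g[y + q + j][x + p + i];
                s += (long)d * d;
            }
        return s;
    }

    /* Flow vector of the blob at (x, y): the displacement (p, q) that
       minimises the SSD.  The caller must keep (x, y) far enough from the
       image border that every access stays in range. */
    void match_blob(const unsigned char f[IMG_H][IMG_W],
                    const unsigned char g[IMG_H][IMG_W],
                    int x, int y, int *best_p, int *best_q)
    {
        long best = LONG_MAX;
        for (int q = -RADIUS; q <= RADIUS; q++)
            for (int p = -RADIUS; p <= RADIUS; p++) {
                long s = ssd(f, g, x, y, p, q);
                if (s < best) { best = s; *best_p = p; *best_q = q; }
            }
    }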

    INTERPRETATION OF OPTICAL FLOW FIELDS

It can be shown geometrically that for objects at the same radial distance from the camera, the optical flow at the periphery of the image is much larger than that at the center. Thus the peripheral flow vectors can be detected reliably even in a compressed image. Moreover, the robot has very little probability of colliding with an object whose image is at the periphery, so we may neglect the peripheral optical flow vectors for obstacle avoidance. If the field diverges strongly at some point, the robot is in danger of colliding with the object at that point, and so it moves away from that point. While moving through a corridor we try to equalize the flow fields of the two walls; this brings the robot to the center of the corridor.

In order to analyze the data from the optical flow field, we find the time to contact of the obstacles. The inherent assumption is that the robot is moving along the optical axis.

The figure above shows the projection geometry. The point of interest P at coordinates (X, Y, Z) is projected through the focus of projection centered at the origin of the coordinate system (0, 0, 0). P is stationary in the real world, whereas the origin of projection moves forward with a velocity of dZ/dt. As the image plane moves closer to P, the position of p in the image also changes. Using similar triangles:

x / X = y / Y = z / Z        ... eqn. (1)

y / z = Y / Z                ... eqn. (2)

i.e. yZ = Yz.

Differentiating with respect to time (the prime denotes d/dt):

y'Z + yZ' = Y'z + Yz'        ... eqn. (3)

z is the distance of the image plane from the optical center, so it remains constant as the robot moves forward (z' = 0). Since the robot is moving along the optical axis, Y also remains constant (Y' = 0; see the figure above). So eqn. (3) becomes:

Z / Z' = -(y / y')           ... eqn. (4)

i.e. the time to contact, -(Z / Z'), equals y / y'.

Since we already know the velocity of the robot (i.e. Z'), we can determine the depth Z from eqn. (4), and then X and Y from eqn. (1). Thus we can plot Z for each point in the image; such a plot is called a Depth Plot.
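As a small illustration of eqns. (1) and (4), a sketch of the per-blob time-to-contact and depth computation; the function names and the handling of the zero-flow case are ours.

    /* Time to contact from eqn. (4): ttc = y / y'.
       y  : image ordinate of the blob, relative to the image center
       dy : vertical flow of the blob between the two frames
       dt : time elapsed between the two grabbed frames */
    double time_to_contact(double y, double dy, double dt)
    {
        double y_dot = dy / dt;            /* y' */
        if (y_dot == 0.0) return 1e9;      /* no divergence: treat as far away */
        return y / y_dot;
    }

    /* Depth from eqn. (4): Z = v * ttc, where v is the robot's known
       forward speed (|Z'|).  X and Y then follow from eqn. (1):
       X = x * Z / z and Y = y * Z / z, with z the calibrated focal
       distance of the image plane. */
    double depth(double ttc, double v)
    {
        return v * ttc;
    }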
The data used for calculation and for plotting the depth plot is obtained by convolving the original depth-plot image with a smoothing filter, which quite effectively damps out the noise in the depth data.
     
     

          1       2       1
          2       4       2
          1       2       1
The smoothing filter used (entries sum to 16)
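A sketch of this smoothing step in C; we assume the usual normalization by the kernel sum, 16, which the report does not state explicitly.

    /* 3x3 smoothing kernel from the report (assumed normalized by 16). */
    static const int K[3][3] = { {1, 2, 1},
                                 {2, 4, 2},
                                 {1, 2, 1} };

    /* Convolve the depth-plot image src (h rows x w cols) with K, writing
       the smoothed result to dst.  Border pixels are copied unchanged. */
    void smooth(const unsigned char *src, unsigned char *dst, int w, int h)
    {
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++) {
                if (x == 0 || y == 0 || x == w - 1 || y == h - 1) {
                    dst[y * w + x] = src[y * w + x];   /* leave border as-is */
                    continue;
                }
                int acc = 0;
                for (int j = -1; j <= 1; j++)
                    for (int i = -1; i <= 1; i++)
                        acc += K[j + 1][i + 1] * src[(y + j) * w + (x + i)];
                dst[y * w + x] = (unsigned char)(acc / 16);
            }
    }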

    Some of the results obtained can be seen below --
     
     


• Robot fixed and the plank moving laterally w.r.t. the optical axis
• Robot fixed and the hand moving towards the robot
• Robot fixed and the plank moving towards the robot
• Robot heading directly towards a wooden plank
• Robot moving parallel to the wooden plank
• Robot moving towards obstacles at different depths

Note : the obstacle that is nearer produces the larger optical flow field.

• Robot heading directly towards a wooden plank, with the corresponding optical flow fields and the corresponding depth plot
The time-to-contact algorithm takes in the optical flow data and outputs the time taken to make contact with the object present at each particular blob in the image. From the time to contact we find (knowing the exact value of z from calibration) the depth of the different blobs, and plot the depth map.

This is also used to find the values of X & Y in order to --

1. identify the obstacle, and
2. find the left & right edges of the obstacle.


All this data is then fed to the decision-making algorithm, which determines whether to move right or left or to continue straight, and by what amount.

    DECISION MAKING BASED ON OPTICAL FEEDBACK

Our strategy program makes decisions about what to do next and directs the robot accordingly. We assume that obstacle heights lie between 5 and 15 cm; anything lower will not be detected by the camera, which is a limitation of this setup. Another implicit assumption is that the calculated depth map is accurate.

We first calculate the depth map. If no obstacle in the image is nearer to the robot than a particular threshold (which is decided by the speed of the robot), the robot continues its motion. If there is indeed an obstacle in range, both its left edge and its right edge are calculated. Since we know the Z of the obstacle, we find the X of each edge; if an edge falls within the central region from -X_thres to X_thres, the robot moves in such a way that the right edge or the left edge, depending on the case, moves out of this region.

It may also happen that there is a passage through which the robot tries to maneuver. If the passage width is less than the width of the robot, both edges lie within -X_thres to X_thres, and in this case the robot comes to a halt. A condensed sketch of this rule is given below.
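In this sketch the obstacle representation, the threshold parameters, and the turn directions are our illustrative assumptions, not the project's actual code.

    typedef enum { GO_STRAIGHT, TURN_LEFT, TURN_RIGHT, HALT } Decision;

    /* Nearest obstacle extracted from the depth map. */
    typedef struct {
        double z;        /* depth of the obstacle                     */
        double x_left;   /* real-world X of its left edge, eqn. (1)   */
        double x_right;  /* real-world X of its right edge            */
    } Obstacle;

    /* z_thres depends on the robot's speed; x_thres bounds the central
       "collision corridor" from -x_thres to x_thres. */
    Decision decide(const Obstacle *ob, double z_thres, double x_thres)
    {
        if (ob == NULL || ob->z > z_thres)
            return GO_STRAIGHT;                /* nothing in range */

        int left_in  = ob->x_left  > -x_thres && ob->x_left  < x_thres;
        int right_in = ob->x_right > -x_thres && ob->x_right < x_thres;

        if (left_in && right_in)
            return HALT;          /* passage narrower than the robot */
        if (left_in)
            return TURN_LEFT;     /* obstacle on the right: turning left
                                     shifts it rightward in the robot's
                                     frame, out of the corridor          */
        if (right_in)
            return TURN_RIGHT;    /* obstacle on the left: mirror case   */
        return GO_STRAIGHT;
    }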

The decisions about motion are made and the appropriate instructions are passed via the transmitter to a receiver on the robot, where they are decoded by the onboard microcontroller. This microcontroller then sends adequate pulses to the motors, which in turn drive the wheels, and the robot moves as per the decision made by the strategy program.
     

    IMPLEMENTATION ON HARDWARE
We implement the commands given by the decision-making program, or strategy program, on a physical robot. We use a robot made of Lego, whose pictures are given below.

Documentation

GRABBING IMAGE SEQUENCES

The grabbing of image sequences, and their conversion into the required numerical-array buffers, was accomplished using the MATROX imaging card. The source code referred to below first allocates the various memory buffers and child buffers used, and uses standard Matrox commands to display frames continuously at a rate of 30 frames per second. It then grabs two frames for further computation and passes the required buffers on to the included header "library.h". The program also measures the time taken to grab the two images. The later part of the code contains standard commands for displaying the buffers processed by the included library.
Included libraries :- mil.h, stdio.h, stdlib.h, conio.h, library.h (this library was written by us)
Input :- real-time data from the vision card
Output :- two grabbed image buffers and the time elapsed between them
    To see the source code click here.
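For illustration, a minimal sketch of the grab step using the classic MIL idiom (MappAllocDefault, MbufAlloc2d, MdigGrab). This is our reconstruction, not the project's actual source, and exact signatures vary across MIL versions.

    #include <mil.h>
    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        MIL_ID MilApplication, MilSystem, MilDisplay, MilDigitizer;
        MIL_ID Frame1, Frame2;

        /* Allocate default application, system, display and digitizer. */
        MappAllocDefault(M_SETUP, &MilApplication, &MilSystem,
                         &MilDisplay, &MilDigitizer, M_NULL);

        /* Two 320 x 240 8-bit buffers for the consecutive frames. */
        MbufAlloc2d(MilSystem, 320, 240, 8 + M_UNSIGNED,
                    M_IMAGE + M_GRAB, &Frame1);
        MbufAlloc2d(MilSystem, 320, 240, 8 + M_UNSIGNED,
                    M_IMAGE + M_GRAB, &Frame2);

        /* Grab two frames and measure the elapsed time. */
        clock_t t0 = clock();
        MdigGrab(MilDigitizer, Frame1);
        MdigGrab(MilDigitizer, Frame2);
        double dt = (double)(clock() - t0) / CLOCKS_PER_SEC;
        printf("time between grabs: %f s\n", dt);

        /* ... pass Frame1, Frame2 and dt to the flow-field code ... */

        MbufFree(Frame2);
        MbufFree(Frame1);
        MappFreeDefault(MilApplication, MilSystem, MilDisplay,
                        MilDigitizer, M_NULL);
        return 0;
    }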
     

    Finding the flow fields

The image buffers received by this segment of the program are arrays of 320 x 240 numbers. The flow fields are estimated by minimising the SSD of an 8 x 6 block over a 9 x 9 neighbourhood, i.e. 81 candidate matches. The search is done radially outwards.
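One way to generate the radially-outward search order over the 81 displacements (our illustration, not the project's code; visiting near displacements first lets an early good match cut the search off cheaply):

    #include <stdlib.h>

    /* One candidate displacement in the 9 x 9 neighbourhood. */
    typedef struct { int p, q; } Disp;

    static int by_radius(const void *a, const void *b)
    {
        const Disp *da = a, *db = b;
        int ra = da->p * da->p + da->q * da->q;
        int rb = db->p * db->p + db->q * db->q;
        return ra - rb;
    }

    /* Fill order[] with all 81 displacements of the 9 x 9 neighbourhood
       (-4..4 in each axis), sorted radially outwards from (0, 0). */
    void radial_order(Disp order[81])
    {
        int n = 0;
        for (int q = -4; q <= 4; q++)
            for (int p = -4; p <= 4; p++)
                order[n++] = (Disp){ p, q };
        qsort(order, 81, sizeof(Disp), by_radius);
    }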

    Finding the Time to Contact and getting the Relative depth plot

The next segment of the program computes the time to contact of each 8 x 6 block, using the two arrays and the time information supplied by the program above. It then finds the relative depth of each point by multiplying the time to contact by the robot's velocity, and plots the corresponding depth plot on a scale of 0 - 255.
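A sketch of the 0 - 255 scaling; the normalization and the bright-is-near convention are our assumptions.

    /* Map relative depths to 0-255 grayscale for the depth plot.
       Nearer blobs are drawn brighter here (an arbitrary choice). */
    void depth_to_gray(const double *depth, unsigned char *gray, int n)
    {
        double zmin = depth[0], zmax = depth[0];
        for (int i = 1; i < n; i++) {
            if (depth[i] < zmin) zmin = depth[i];
            if (depth[i] > zmax) zmax = depth[i];
        }
        double span = (zmax > zmin) ? (zmax - zmin) : 1.0;
        for (int i = 0; i < n; i++)
            gray[i] = (unsigned char)(255.0 * (zmax - depth[i]) / span);
    }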

    Decision making (Strategies)

This section stores the real-world coordinates of the obstacles in an array and finds their edges. It uses this data to decide what to do next, and passes the decision to the program which sends data to the hardware through the serial port.
Included libraries :- mil.h, math.h, time.h
Input :- two image buffers and the time taken to grab them
Output :- image buffers containing the flow fields, the depth plots, the original image, and the decisions taken
    To see the source code click here.

Sending the instructions to the onboard microcontroller :- This is a standard program to send data on the serial port to the microcontroller. Instructions are sent in the form of bytes.
Included libraries :- dos.h, conio.h, math.h, ctype.h, stdio.h, stdlib.h, io.h
Input :- decisions made by the above program.
Output :- bytes sent to the microcontroller.
    To see the source code click here.
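A sketch of what such a byte send can look like under DOS-era Borland C, polling the standard 8250 UART registers on COM1; the project's actual byte protocol is not specified here.

    #include <dos.h>

    #define COM1_DATA 0x3F8            /* COM1 UART data register */
    #define COM1_LSR  (COM1_DATA + 5)  /* line status register    */

    /* Send one instruction byte to the onboard microcontroller over COM1,
       waiting until the transmitter holding register is empty (bit 5). */
    void send_byte(unsigned char b)
    {
        while ((inportb(COM1_LSR) & 0x20) == 0)
            ;                          /* wait until ready to transmit */
        outportb(COM1_DATA, b);
    }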
Execution of instructions :- This program was written in assembly language and programmed into the microcontroller. It decodes the bytes received through the serial port and sends appropriate instructions to the motors.
Input :- the bytes sent through the serial port.
Output :- execution of the instructions by the motors.
     

    Sample Input & Expected Output

The following is a demonstration of what the robot is expected to do after the completion of the project. The top view displays the target position which the robot is to attain, shown as a pink flag on the right side of the image. The robot is on the left side of the image and is required to reach the target, finding its path amidst many fixed obstacles.
     
     

    View1 : The objective of the robot is to reach the pink flag and it starts heading towards the pink obstacle

     

View2 : The robot turns right to avoid the pink obstacle

     

    View3 : The robot heads towards the white obstacle

     

View4 : The robot turns left to avoid the white obstacle and aligns itself with the green obstacle

     

View5 : After avoiding the white obstacle, the robot aligns itself in front of the passage between the white and the green obstacles

     

    View6 : The robot successfully reaches the target position

    On-Line Links


    Annotated Bibliography


@PhdThesis{Camus:1995,
        author =      { Camus, Ted },
        year =        { 1995 },
        keywords =    { OPTICAL FLOW, TIME TO CONTACT },
        institution = { Brown University - CS },
        title =       { Real-Time Optical Flow },
        month =       { May },
        pages =       { 130 },
        annote =      {

The thesis describes three different techniques for the computation of optical flow in real time: gradient-based optical flow, correlation-based optical flow, and linear optical flow. The intrinsic assumption in gradient-based optical flow is that the brightness at a given point in an image is constant w.r.t. time; the two-dimensional optical flow fields are computed using this assumption. The correlation-based technique is the most reliable and robust of the three. The assumption here is that a block of nearby n x n pixels has the same optical flow field, and the search space for the corresponding block in the next frame is limited by a factor which depends on the velocity of the moving robot or of the external world. This technique is computationally inefficient because the search space is quadratic. The linear optical flow technique is a correspondence search on the time scale, i.e. it searches for the corresponding pixel in successive frames at a location specified a priori.
        The thesis also describes algorithms for the computation of the time to contact of each point in the field of view. The estimation of the focus of expansion for different motion sequences is dealt with in elaborate detail, as are various optical flow estimation problems like the aperture problem and the temporal aliasing problem.

                                        Abhishek Tiwari
                                        Himanshu Arora
                                        Tirthankar Bandyopadhyay (4/2000)}
}

@Article{Barron/Fleet/Beauchemin:1992,
  author =      { Barron, J.L. and D.J. Fleet and S.S. Beauchemin },
  year =        { 1992 },
  keywords =    { OPTICAL FLOW IMAGE },
  institution = { UWO-CS/QU-CS/UWO-CS },
  title =       { Performance of Optical Flow Techniques },
  month =       { July },
  pages =       { 81 },
  annote =      {

This paper describes various techniques for the computation of optical flow fields. All of these techniques can be broadly classified into two major types: differential techniques and region-based techniques, and a few of each type are described in the paper. Differential techniques make the broad assumption that either the intensity, or the gradient of the intensity, is constant with time. This is a crude assumption, with the added disadvantage that the gradients usually cannot be calculated accurately. The region-based techniques are more reliable: even if the equations obtained by minimizing the SSD function do not have a unique solution, data from past frames can be used to decide between candidate pixels, which yields a closer estimate than the differential techniques.

                                        Himanshu Arora
                                        Abhishek Tiwari
                                        Tirthankar Bandhopadhyay (2/2000)}}
    
    
    

This proposal was prepared by Abhishek Tiwari, Himanshu Arora and Tirthankar Bandhyopadhyay as a part of the project component of the Course on Artificial Intelligence in Engineering in the Jan-Apr semester of 2000. (Instructor: Dr. Amitabha Mukerjee)

     [ COURSE WEB PAGE ] [ COURSE PROJECTS 2000 (local CC users) ]