Storytelling Science: How we navigate

Amitabha Mukerjee

If someone asks you to get up from where you are reading this paper and walk out of the room, you can do it very easily, without hitting anything. Asked "How can you do it?", most people will answer - "Because I can see."
But how exactly do we "see"? Until very recently, science did not appreciate the enormity of this problem. In the 1960s, the exciting field called "Artificial Intelligence" had just started up, and people were trying to build robots that solved problems the same way humans did. The problems considered difficult were like those in IQ tests, while problems like navigating past obstacles were thought to be trivial.

The robot Shakey (1970) could only push these blocks. Its goal was to stack the blue cube on top of the red cube.

In the late 1960s, researchers at the Stanford Research Institute started the Shakey robot project, in which they wanted to solve a reasoning problem with a real robot. They put a camera on a robot, and put the robot in a room with some blocks. The robot had wheels, and could push things. It used a logical planning approach, and could reason out that if it pushed the red block flush with the top of the ramp, then the blue block could be pushed up the ramp to make a stack of blocks. But when the robot was put inside the room, it failed to recognize the blocks, because shadows and lighting caused the light coming from them to differ from spot to spot. Eventually, Shakey managed to solve the problem, but only under extremely restrictive conditions.

How People See
If you ask yourself "How do I avoid hitting things as I move?", perhaps you will say "I know where they are, and I don't go near them." There is some truth in this, but psychologists have shown that this is not exactly how we move. We don't need to know the exact positions of the things around us. As images fall on the retina, the changes in these images are measured; this pattern of image motion is called Optical Flow.
Our eyes use the phenomenon of Parallax to measure distance. If you are travelling by train, the nearby trees zip past, whereas those far away move more slowly. On the retina, nearby objects cause a faster "optical flow" than distant ones - this tells the eye how far away things are. Parallax also tells you how far away moving things are - if you are standing still and a car passes near you, it causes a large optical flow, whereas a truck on a faraway highway appears to move more slowly.
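The train example can be put into numbers with a small sketch. It assumes the simplest case: the observer moves in a straight line at a known speed, and the object is stationary and directly to the side, so that its angular speed across the eye is roughly speed divided by distance. The function names and the speed of 20 m/s are illustrative assumptions, not taken from any particular system.

```python
def angular_speed(observer_speed, distance):
    """Angular speed (radians/second) of a stationary object seen directly
    to the side by an observer moving at observer_speed (metres/second).
    Assumed simplification: flow = speed / distance."""
    return observer_speed / distance


def distance_from_flow(observer_speed, flow):
    """Invert the relation above: a faster flow means a nearer object."""
    return observer_speed / flow


train_speed = 20.0                              # m/s, an assumed value
near_tree = angular_speed(train_speed, 10.0)    # tree 10 m from the track
far_hill = angular_speed(train_speed, 1000.0)   # hill 1 km away

print("near tree flow:", near_tree, "rad/s")
print("far hill flow:", far_hill, "rad/s")
print("distance recovered from near flow:", distance_from_flow(train_speed, near_tree), "m")
```

In this simplified model the nearby tree sweeps across the view a hundred times faster than the distant hill, which is the cue the eye can use to rank things by distance.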
Suppose a rock is flying at you. Your eyes will signal a flow that is rapidly diverging from a single part of the image. This means something is about to hit you, and you don't wait for the brain to process the images and identify the rock - well before that, your muscles are instructed to duck, and duck vigorously!
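One way to see why a diverging flow is such a useful warning: for an object coming straight at you at a steady speed, the time left before impact is roughly its apparent (angular) size divided by the rate at which that size is growing. Neither the real size of the rock nor its distance is needed. The sketch below is a minimal illustration of that idea under those assumptions; the function name and the numbers are hypothetical.

```python
import math


def time_to_contact(size_before, size_now, dt):
    """Estimate seconds until impact from two apparent (angular) sizes of an
    approaching object, measured dt seconds apart.
    Assumes a head-on approach at constant speed and small angles."""
    growth_rate = (size_now - size_before) / dt
    if growth_rate <= 0:
        return math.inf  # image is not expanding, so nothing is approaching
    return size_now / growth_rate


# Illustrative numbers: a 0.1 m rock seen at 10 m, then at 9 m, 0.05 s later.
size_at_10m = 0.1 / 10.0   # apparent size in radians (small-angle approximation)
size_at_9m = 0.1 / 9.0

tau = time_to_contact(size_at_10m, size_at_9m, 0.05)
print("estimated time to contact:", tau, "seconds")
```

The estimate comes out at about half a second, close to the true value for a rock travelling at 20 m/s, and it was obtained purely from how fast the image was expanding.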

How Robots See
In the Robotics Center at IIT Kanpur, students Adnan Bohari and Vivek Singh have recently built a robot that works this way - it uses optical flow to avoid obstacles. Of course, the problem isn't quite that simple: even slight undulations in the floor cause problems - the optical flow jitters up and down and has to be corrected, which is the new contribution of the current work.

The robot turning to avoid an obstacle. The optical flow image (right) shows higher velocities (and hence nearer obstacles) in the left and middle of the visual field, so the robot decides to turn right.
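The decision described in the caption can be sketched in a few lines. This is only an illustration of the general idea, not the method used by the IIT Kanpur robot: it assumes the flow image is given as a grid of flow speeds, splits the grid into left, middle and right thirds, and turns away from the side with the larger average flow. The function names and the threshold are hypothetical.

```python
def average(values):
    """Mean of an iterable of numbers, or 0.0 if it is empty."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def choose_turn(flow, threshold=1.0):
    """Pick a steering action from a grid of optical flow speeds.
    flow: list of rows, each a list of flow magnitudes (at least 3 columns).
    threshold: assumed flow level below which the path counts as clear."""
    width = len(flow[0])
    third = width // 3
    left = average(v for row in flow for v in row[:third])
    middle = average(v for row in flow for v in row[third:width - third])
    right = average(v for row in flow for v in row[width - third:])

    if max(left, middle, right) < threshold:
        return "straight"            # little flow anywhere: nothing is near
    # Larger flow means nearer obstacles, so steer away from that side.
    return "right" if left > right else "left"


# A flow grid like the one in the figure: fast on the left and in the middle.
flow_image = [
    [5, 5, 4, 4, 1, 1],
    [6, 5, 4, 3, 1, 0],
]
print(choose_turn(flow_image))
```

With fast flow on the left and in the middle, as in the figure, this rule chooses to turn right.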

Are robots at the stage where they can use visual information to navigate in their world? In some important ways, e.g. for driving along highways, robots have become very competent. Many such vehicles have driven long distances, e.g. from New York to Los Angeles, across 3,500 km of American highway, 98% of the time without a human at the wheel.
But for everyday tasks like getting up from a chair and finding their way out of the room, robots still have a long way to go.

Sources
Raphael, B., 1972. Robots, Chapter 8 in The Thinking Computer, Freeman Publishing.

