First of all, why are we so fascinated by our ability to perceive depth, and what does all this mean to a layman? You have had vision (eyes) for so many years; imagine a world without it. Frightening, right? Now imagine having sight in just one eye. Most people would be okay with that, and some even ask me what difference it makes. That is what really frightens us computer vision researchers. We have been chasing this problem for decades, and many researchers have spent their entire lives trying, in vain, to decode it, yet here are people who do not know its significance in spite of using it every day. No problem; that is what this article is for.
There are two major things involved in vision: sight and depth. Many people fail to distinguish between the two. Sight is the perception of light, and depth is the perception of the space around you. “An experience is worth reading 1000 pages”, so better try it out yourself. From the time you get up in the morning, spend the entire day with one eye closed. Observe whether you can get through life as easily as you could with both eyes open. (Disclaimer: I take no responsibility for any accidents that might happen as a result of performing this experiment.) But to get a feel for what drives so many people to pour so much effort into giving a machine the perception of depth, you have to try it out. Do not read my other posts until you have got at least something from this activity.
One experiment that I don’t want you to miss is this: hang a rope, a wire, a stick, anything, from a point such that there is free space all around it. Get your fingers ready in a grasping position, move your hand towards the wire in the direction perpendicular to it, and grasp it. Remember to close one eye! If you get it right, believe me, you are the luckiest person. If not, you will definitely want to know the magic that your brain is doing with two images. That is exactly what all our research concentrates on.
Also try judging the depth between two objects placed at different distances with just one eye open. Experiment on as many objects as possible. Without opening both eyes you cannot tell the distance between two objects, except through monocular cues (I will come to those later).
If you think about it carefully, I am not saying anything new. Looking with one eye is equivalent to taking an image with a camera. In a camera image, the 3D surroundings are projected onto a 2D surface, and from this projection alone it is impossible to know at what depth an object originally was. Take a look at the image below. The square and the circle are two objects in front of the sensor. Assume they are initially placed at 10 m (circle) and 15 m (square). Their projections on the sensor would be as shown at the right. Now try placing the circle anywhere along its line, and the square anywhere along its line. Do you see any difference in where they are projected? Not at all: each object lands at the same place on the sensor no matter where the two sit relative to each other along their respective lines.
Some people argue that you should be able to observe the size of the object on the sensor change as it moves away, so in some way you know whether the object is far or near. I totally agree, but what difference does it make? Who knows the real size of the objects? All I have is the projection at one particular instant and nothing else. When I move an object closer to the sensor its projected size definitely increases, but here we are talking about the depth between two objects, which our brain works out from two images. Even if the projected size changes as you move the object towards or away from the sensor, how does that give you its absolute depth? We can always find two combinations of size and distance, one object big and far from the sensor and the other small and close to it, that give exactly the same projection.
Looking at the sensor you never know where the objects were because you don’t know their actual sizes!
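
If you would rather check the argument above with numbers, here is a minimal Python sketch. It assumes an ideal pinhole camera with a focal length of 1 and looks at a flat slice of the scene (sideways position x, depth z). The function names, the ray slope of 0.2 and the object sizes are my own illustrative choices, not values taken from the figure.

```python
import math


def project(x, z, focal_length=1.0):
    """Ideal pinhole projection: a point at sideways offset x and depth z
    lands at focal_length * x / z on the sensor."""
    return focal_length * x / z


def point_on_ray(slope, depth):
    """Return (x, z) for the point at the given depth on a ray that passes
    through the camera centre with the given slope (x = slope * z)."""
    return slope * depth, depth


def projected_size(real_size, depth, focal_length=1.0):
    """Size of the image of an object of the given real size and depth."""
    return focal_length * real_size / depth


if __name__ == "__main__":
    # Slide the circle along its line: the position on the sensor never changes.
    for depth in (10.0, 15.0, 30.0):
        x, z = point_on_ray(0.2, depth)
        print(f"circle at {depth:5.1f} m -> sensor position {project(x, z):.4f}")

    # A small, near object and a big, far object can look exactly the same size.
    print(f"1.0 m object at 10 m -> image size {projected_size(1.0, 10.0):.4f}")
    print(f"1.5 m object at 15 m -> image size {projected_size(1.5, 15.0):.4f}")
```

The depth cancels out in both cases: along a ray only the slope survives, and for size only the ratio of real size to depth survives. That ratio is all a single image gives you.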
When you look at a photograph, you still get a fairly good sense of the depth in it, thanks to the many monocular cues your brain uses along with the knowledge it has gained over the years. I will have a separate post on monocular cues, so wait for that.