Saturday, April 21, 2007

Computer Vision (20)

Even though many people believe that viewing a stereogram is exactly equivalent to seeing with both eyes, there is one major difference. Stereograms are generally shot either by moving a single camera horizontally by a short distance or by keeping two cameras side by side, so they capture only horizontal disparity. Suppose there are two infinitely long horizontal bars, separated from each other both horizontally (in depth) and vertically, with nothing else around them. If you take a stereo image of this scene with the bars running across the image, you will fail to capture any horizontal disparity, because there is none in this direction.

A camera projects the scene along its line of sight (a horizontal line pointing towards you), so the distance between the horizontal bars along this direction cannot be shown in the 2D image. The vertical distance between them is ‘v’. In other words, if we try to capture this 3D setup in a stereo image pair to recover the horizontal depth between the bars, we end up with exactly the same image on the left and on the right. There is no use viewing the pair stereoscopically, because there is no distinct point in one image for the brain to match with a point in the other. Since the camera is moved only horizontally, the vertical distance between the two bars remains the same in both images.
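To make the argument above concrete, here is a minimal sketch that assumes an ideal pinhole camera looking along the depth axis. The function names, the baseline, and the bar positions are all made up for illustration. It shows that each bar lands on the same image row in the left and right views, while an isolated point at the same depth does produce a horizontal disparity.

```python
def project(point, camera_x, focal_length=1.0):
    """Ideal pinhole projection of a 3D point (X, Y, Z).

    Assumption: the camera sits at (camera_x, 0, 0) and looks along +Z.
    Returns (column, row) in image coordinates.
    """
    X, Y, Z = point
    column = focal_length * (X - camera_x) / Z
    row = focal_length * Y / Z
    return column, row


def bar_image_row(bar_height, bar_depth, camera_x, focal_length=1.0):
    """Image row of an infinitely long bar parallel to the X axis.

    Every point on the bar shares the same Y and Z, so the whole bar
    maps to a single image row that spans all columns.
    """
    _, row = project((0.0, bar_height, bar_depth), camera_x, focal_length)
    return row


BASELINE = 0.065  # assumed horizontal camera separation, in metres
left_x, right_x = -BASELINE / 2, BASELINE / 2

# Two hypothetical bars given as (height, depth) in metres.
bars = [(0.5, 2.0), (-0.5, 4.0)]
left_rows = [bar_image_row(h, z, left_x) for h, z in bars]
right_rows = [bar_image_row(h, z, right_x) for h, z in bars]

# For contrast: a single point at the depth of the first bar.
point = (0.0, 0.5, 2.0)
disparity = project(point, left_x)[0] - project(point, right_x)[0]

print("bar rows, left image :", left_rows)
print("bar rows, right image:", right_rows)
print("disparity of the isolated point:", disparity)
```

The rows are identical in both views because a horizontal camera shift changes only the column coordinate, and an infinitely long bar has no distinguishable column to match on.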

In a real scenario, how do our eyes and brain together manage to pick the right point, that is, form a triangle and get the depth out of it? This is possible because, in addition to the 2D projection, our eyes collect one more parameter in real time: focus. Focus is the same thing as the accommodation that I described under monocular cues. The eye has to accommodate itself to focus on (see sharply) objects at different depths. When an object at one depth is seen sharply, objects at other distances appear blurred, by an amount that depends on the aperture of the eye. This means that focus, or accommodation, depends on depth and is unique for every distance from the eye. So the accommodation value would actually give the absolute depth of the object.
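As a rough illustration of why accommodation can encode depth, the sketch below treats the eye as a single thin lens with a fixed lens-to-retina distance. That is a strong simplification, and the 17 mm figure and function names are assumptions chosen only for the example. Under the thin lens equation 1/f = 1/u + 1/v', each object distance u needs a different focal length f, so the focal length can be inverted back to a depth.

```python
IMAGE_DISTANCE = 0.017  # assumed lens-to-retina distance, in metres


def required_focal_length(object_distance, image_distance=IMAGE_DISTANCE):
    """Focal length that focuses an object sharply on the retina.

    Thin lens equation: 1/f = 1/u + 1/v'
    (u = object distance, v' = image distance).
    """
    return 1.0 / (1.0 / object_distance + 1.0 / image_distance)


def depth_from_focal_length(focal_length, image_distance=IMAGE_DISTANCE):
    """Invert the thin lens equation to recover the object distance."""
    return 1.0 / (1.0 / focal_length - 1.0 / image_distance)


depths = [0.25, 1.0, 5.0]  # hypothetical object distances, in metres
focal_lengths = [required_focal_length(d) for d in depths]
recovered = [depth_from_focal_length(f) for f in focal_lengths]

for d, f, r in zip(depths, focal_lengths, recovered):
    print(f"depth {d:5.2f} m -> focal length {f * 1000:.3f} mm -> recovered {r:.2f} m")
```

In this model the required focal length grows steadily with distance, which is the sense in which each depth has its own accommodation value. The differences become very small for far objects, so a real system relying on this cue alone would lose precision at long range.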
