Friday, April 27, 2007
Computer Vision (21) and Optics
Monday, April 23, 2007
Photography and Travel
Kumara Parvatha, also known as KP, is one of the tallest peaks in Karnataka. It is a real challenge to trek this in a day. The interesting part is that you can descend by a different route from the one you took to climb, which makes the trek even more interesting. This photo was taken at the top, which we reached a bit late in the morning (the day after we started the climb :( ). Nevertheless, it is heaven out there at any time. This place is situated near Subramanya. You can get more info on it here: http://www.kumaraparvathaconquered.blogspot.com
Saturday, April 21, 2007
Computer Vision (20)
A camera takes the horizontal projection of objects (along the horizontal line pointing towards you), so the distance between the horizontal bars along this direction cannot be shown in this 2D image. The vertical distance between them is ‘v’. In other words, if we try to capture this 3D setup in a stereo image pair to get the horizontal depth between the bars, we end up with exactly the same image on the left and the right. There is no use viewing it stereoscopically, because which points in the two images would the brain match with each other? Since the camera is moved horizontally, the vertical distance between the two bars remains the same in both images of the stereo pair.
Photography
To give the readers a change from the technical stuff, I had thought of posting some other things also, so here it is. This is a place called Devarayana Durga in Tumkur, around 70 km from Bangalore. It's a nice place to visit for a day, but it was already 4 when we left Bangalore, so we reached there right at sunset. Managed to capture the last few glimpses of the sun for that day. If you guys plan to go there, better leave a bit early.
Sunday, April 15, 2007
Computer Vision (19)
The red lines are traced when the eyes combine the rectangle, and the green lines when they combine the circle. The point of intersection of the red lines gives the 3D location of the rectangle, and that of the green lines gives the 3D location of the circle. As mentioned earlier, the circle is in front of the rectangle when cross viewed. One of the points forming the triangle is the point of intersection of either the red or the green lines, and the other two points are the two eyes. The distance of the point of intersection from the two eyes (d) depends on the separation between the images, so the absolute distance of the objects remains unknown in the stereo image pair. The relative depth of different objects from one another is obtained by corresponding objects from the two images, which moves the point of intersection of the lines according to the 3D placement of the objects (similar to the red and green lines).
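The intersection of the sight lines described above can be worked out numerically. This is only a rough sketch under an idealised geometry that is my assumption, not something taken from the diagram: the eyes are two points separated by `eye_sep`, both images lie on a flat plane at `screen_dist` in front of the eyes, and when cross viewing the left eye looks at the copy in the right image while the right eye looks at the copy in the left image. All names and numbers are hypothetical.

```python
def crossview_depth(eye_sep, screen_dist, x_in_left_image, x_in_right_image):
    """Distance from the eyes at which the two crossed sight lines intersect.

    Eyes sit at x = -eye_sep/2 and x = +eye_sep/2. The left eye looks at the
    object's copy in the right image, the right eye at its copy in the left
    image (cross viewing). Solving the two line equations gives:
        depth = screen_dist * eye_sep / (eye_sep + separation)
    where separation is how far apart the two copies are on the screen.
    """
    separation = x_in_right_image - x_in_left_image
    if eye_sep + separation <= 0:
        raise ValueError("sight lines do not meet in front of the eyes")
    return screen_dist * eye_sep / (eye_sep + separation)


# Hypothetical numbers in centimetres: the two copies of the circle are
# slightly further apart than the two copies of the rectangle.
EYE_SEP, SCREEN = 6.5, 50.0
rectangle_depth = crossview_depth(EYE_SEP, SCREEN, -3.25, 3.25)
circle_depth = crossview_depth(EYE_SEP, SCREEN, -3.5, 3.5)
print(rectangle_depth, circle_depth)  # the circle comes out nearer

# Sliding the two images further apart changes both absolute depths,
# but the circle still stays in front of the rectangle.
rectangle_far = crossview_depth(EYE_SEP, SCREEN, -5.25, 5.25)
circle_far = crossview_depth(EYE_SEP, SCREEN, -5.5, 5.5)
print(rectangle_far, circle_far)
```

With these made-up values the absolute distances change when the images are moved apart, while the ordering of the two objects does not, which is the point made above about d depending on the separation between the images.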
Saturday, April 14, 2007
Computer Vision (18)
The 3D interpretation of it is as follows. The image seen below is the top view of the 3D space whose 2D projection is shown above. On cross viewing it, you would see the circle in front of the rectangle. Cross viewing a stereogram means that your left eye sees the image on the right and your right eye sees the one on the left.
The dotted lines are the angle of view of the eyes (not to scale). The blue lines are the projection lines of the objects on the respective eyes. Since the eyes are placed at some distance from one another the projection of the objects in 3D space will always be different on both the eyes, except when the objects are on the vertical bisector. This difference in the projection lengths is what disparity is.
Friday, April 13, 2007
Computer Vision (17)
Motion parallax, which is a monocular cue, is conceptually similar to stereovision, which is a binocular cue, in the sense that both of them are perceived due to disparity. In the case of motion parallax, to perceive depth along a particular direction you have to move parallel to it. When you are moving in a train, you only capture horizontal disparity between the objects, in the same way as in stereovision we perceive disparity in the direction parallel to the line along which our eyes are placed at that point of time. The first image below is a stereo image pair made into a gif and the second shows motion parallax.

A good animation on motion parallax: http://psych.hanover.edu/KRANTZ/MotionParallax/MotionParallax.html
Motion parallax is in fact mimicking stereovision, but at two different instants of time. Imagine that instead of me, I placed a video camera and shot my train journey. If I extracted any two consecutive frames from it, I would have got a stereo image pair, one taken after a small delay delta compared to the other. In the case of our eyes these two images are captured at the same instant of time, while in the motion parallax case it is equivalent to moving the camera to the second eye’s place to capture the second image of the stereo pair. So, when disparity can solve for depth between a stereo image pair, why not in the case of motion parallax?
You can get more stereo images as shown above here: http://www.well.com/~jimg/stereo/stereo_list.html
Thursday, April 12, 2007
Computer Vision (16)
All the above kinds of depth perception require two images or in other words two eyes, and disparity forms the main cue to perceive depth. Such cues that the brain uses are called binocular cues. There is another category of cues that our brain uses a lot to guess depth in single images, known as monocular cues. Monocular cues are the result of the enormous amount of knowledge our brain has acquired over the years. Monocular cues help us to perceive depth in 2D images. Some links to know more about monocular cues:
Monday, April 9, 2007
Computer Vision (15)
In order to get a triangle out of an object or a point, we have to find the matches of that point in the two images. This process is called stereo correspondence. The one reason I love this field is that it has no standards restricting you in any way. You just need to understand the problem, and then you are free to come up with solutions and techniques to solve it. The problem in front of you is: for each and every point in one image, how do you find the corresponding point in the other? I want to keep your minds fresh and open for new ideas, so I won’t be detailing the currently available techniques, because there are not one or two, but many! I strongly believe that to solve the problems of nature you just need to have an open mind to think in new ways. I want you all to give this problem deep thought before even trying to google for what’s already been cooked. I can assure you that you still have a chance to come up with your own perfect recipe, even though the problem has been worked on for ages.
This was the end of my introduction to “depth perception through stereo imaging”. As I dive much deeper into this problem, try to think about different ways in which you can solve it. As you go on reading my posts from now on, you will find that a lot of good techniques that you had thought about wouldn’t really work in many cases. I will reveal the different dimensions to solving this problem along with the merits and demerits of each of them. Also open up a parallel thread and try to find out what people have been able to come up with so far (you will get to know that you are not far off).
Saturday, April 7, 2007
Computer Vision (14)
Knowing the distance between the eyes (D) and the angle of view of both the eyes (theta1 and theta2), we can always extend the two lines (shown dotted) to meet at a point (O). The perpendicular drawn from O to the line joining the eyes is the depth of the object from your eyes. Since the two eyes always see a common point, the lines emanating from them always converge and make sure that a triangle is formed for an object anywhere in the common 3D space.
Friday, April 6, 2007
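The triangle construction from Computer Vision (14), with D, theta1, theta2 and the point O, can be turned into a small calculation. This is a minimal sketch, assuming theta1 and theta2 are the interior angles between the line joining the eyes and each eye's line of sight; the diagram may measure them differently, and the function name and the sample numbers are hypothetical.

```python
import math


def depth_from_angles(eye_distance, theta1_deg, theta2_deg):
    """Length of the perpendicular from O to the line joining the eyes.

    Assumption: theta1 and theta2 (in degrees) are the angles each line of
    sight makes with the line joining the eyes, measured inside the triangle.
    """
    t1 = math.radians(theta1_deg)
    t2 = math.radians(theta2_deg)
    if t1 <= 0 or t2 <= 0 or t1 + t2 >= math.pi:
        raise ValueError("the two lines of sight do not meet at a point O")
    # The foot of the perpendicular splits the baseline into two pieces,
    # depth / tan(theta1) and depth / tan(theta2), which add up to D.
    return eye_distance / (1.0 / math.tan(t1) + 1.0 / math.tan(t2))


# Hypothetical example: eyes 6.5 cm apart, both turned inwards by the same amount.
print(depth_from_angles(6.5, 45.0, 45.0))   # object close to the face
print(depth_from_angles(6.5, 85.0, 85.0))   # lines nearly parallel, object further away
```

As the two angles approach 90 degrees the lines become nearly parallel and the computed depth grows quickly, so small errors in the angles matter more for far objects.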
Computer Vision (13)
The diagram below gives the 2D projection of the 3D environment shown above, which your eyes send to your brain. For the right eye the image of the square is always to its leftmost followed by the ellipse and the circle. For the left eye the image of the square is always to its rightmost followed by the ellipse and the circle. The left column in the image below is the image captured by the left eye (objects are marked with an ‘l’ on top), the right column is the image formed by the right eye (objects are marked with an ‘r’ on top) and the central column is the combined image formed in the brain (‘l’ is the image that has come from the left eye and ‘r’ is the image that has come from the right eye). ‘op’ in the diagram means the overlap point, the region where the two images are combined, which in our case is the macula. Let me explain it in 3 different cases:
- When your eyes look at the square, the square is the region of overlap in the brain and therefore the square forms the center. Other objects are moved to the sides as named in the diagram. Imagine sliding the left and the right images close to each other such that the squares are placed one over the other.
- When the eyes look at the ellipse, the ellipse forms the center, which is obtained by sliding the two images further towards each other until the ellipses coincide. Here the square from the right eye overlaps the circle from the left eye, and the circle from the right eye overlaps the square from the left eye. They are shown one above the other for the sake of clarity. How does our brain deal with the overlap of dissimilar objects? Will it average the two or suppress one of them? This is again binocular rivalry, about which I will be posting later.
- When the eyes see the circle, the circle forms the center, which is obtained by sliding the two images further towards each other to overlap on the circle.
Monday, April 2, 2007
Photography with Computer Vision (12)
Go to the above link and observe the images before you move further. The concept works like this: each color filter in your glasses should match the color component preserved in one of the two images. The single image that you see in this link is created by taking the red component from one image and the green component from the other and overlapping them. For example, if you are using a red-blue glass combination, one of the images should have the blue and not the red component in it, and the other should have the red and not the blue component. I assume that you all know a color image is a mixture of three layers: Red, Green and Blue. When you overlap the two components, the result looks blurry without the glasses due to the disparity present in stereo images. When you look through these glasses, one of the components in the image is filtered out by each eyepiece, so the same image does not reach both eyes, and your brain resolves the difference to perceive depth. It is equivalent to seeing two images either crossed or parallel. Now you know why they give you these colored glasses when you go to watch a 3D movie.
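The channel mixing described above can be sketched in a few lines. This is only an illustration, not the method used to make the linked images: it assumes each image is a list of rows of (r, g, b) tuples, takes red from the left image and green from the right image (which eye gets which color is my assumption), and sets blue to 0. The function names and the tiny sample "images" are hypothetical.

```python
def make_anaglyph(left_img, right_img):
    """Red channel from left_img, green channel from right_img, blue set to 0.

    Images are lists of rows; each pixel is an (r, g, b) tuple of ints.
    """
    if len(left_img) != len(right_img):
        raise ValueError("images must have the same height")
    anaglyph = []
    for left_row, right_row in zip(left_img, right_img):
        if len(left_row) != len(right_row):
            raise ValueError("images must have the same width")
        anaglyph.append([(lp[0], rp[1], 0) for lp, rp in zip(left_row, right_row)])
    return anaglyph


def seen_through(anaglyph, channel):
    """Simulate an ideal color filter that passes only one channel (0 = red, 1 = green)."""
    return [[pixel[channel] for pixel in row] for row in anaglyph]


# Tiny made-up 1x2 "images", only to show which data ends up where.
left = [[(200, 10, 10), (0, 0, 0)]]
right = [[(0, 0, 0), (10, 180, 10)]]
combined = make_anaglyph(left, right)
print(combined)                    # red comes from left, green from right
print(seen_through(combined, 0))   # what the red eyepiece lets through
print(seen_through(combined, 1))   # what the green eyepiece lets through
```

Each simulated eyepiece recovers only the data that came from one of the two original images, which is why the two eyes end up seeing different images.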
People interested in photography can take their own 3D photographs using their single 2D camera. Landscape photographers would be very excited to take such photographs, because it is very difficult to reproduce the 3D landscape effect in a 2D photo. Here are some of the ways to do it.
Sunday, April 1, 2007
Computer Vision (11)
Once you have learnt to see stereograms, there is a small observation you will have to make. You can actually do it in the above diagram itself. The gap between the circle and the rectangle is not the same in the two images. This change in the relative distance between two objects in the images is what is called disparity. Solving for depth between the two images is actually solving for this disparity. So you cannot lay these two images one above the other and fit both the objects perfectly. When your brain combines these two images, the disparity that exists between them is converted to depth. If your observation is very keen, you will also be able to observe that when your brain combines the rectangles the circles do not overlap perfectly, and when your brain combines the circles the rectangles overlap at an offset. You cannot combine two objects at different depths at the same time in your brain.
The gray color in the overlapped object is shown just to highlight the partial overlap that takes place for objects at different depths other than what is viewed. Our brain does not average the colors, so you won’t see this gray in the single image that your brain creates; instead it will either be the circle from the left image or the right one. This is called binocular rivalry and I want to have a separate post to explain this concept.
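The disparity and the partial overlap described in this post can be put into numbers. This is a small sketch with made-up horizontal positions (in pixels) for the circle and the rectangle in each image; the dictionaries and function names are hypothetical and only illustrate the idea.

```python
def gap_disparity(left_pos, right_pos, obj_a, obj_b):
    """Change in the gap between two objects from the left image to the right image."""
    gap_left = left_pos[obj_b] - left_pos[obj_a]
    gap_right = right_pos[obj_b] - right_pos[obj_a]
    return gap_right - gap_left


def offset_when_fused(left_pos, right_pos, fused, other):
    """Leftover misalignment of `other` after sliding the images so `fused` overlaps."""
    shift = right_pos[fused] - left_pos[fused]
    return (right_pos[other] - left_pos[other]) - shift


# Made-up horizontal centres of each object in the two images.
left_image = {"circle": 40, "rectangle": 100}
right_image = {"circle": 52, "rectangle": 100}

print(gap_disparity(left_image, right_image, "circle", "rectangle"))
print(offset_when_fused(left_image, right_image, "rectangle", "circle"))
print(offset_when_fused(left_image, right_image, "circle", "rectangle"))
```

With these numbers the gap shrinks by 12 pixels between the two images, and whichever object is fused, the other one is left 12 pixels out of alignment. That leftover offset is the partial overlap highlighted in gray.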
