I know the topic of focus was a lot to digest and made it hard to stay concentrated, so I decided to shift the topic a bit, towards motion detection. I have covered less than half of the full focus story, so I will come back to it later, when I explain one of my techniques for solving stereo correspondence.
Even though we have been able to build GHz processors and parallel computing systems, we are having a tough time matching the processing power of our brain. One reason for this is that our brain selectively processes only the information it needs, which our systems fail to do. When we take an image, we do not know which region of it has to be processed, so we end up trying to figure out what each and every pixel in a huge megapixel image could mean or form. But that's not what our brain does. It puts its power only into those regions where it is required the most. For example, recognition is performed, as explained earlier, only in the region of the fovea. Our brain turns its attention to the rest of the visual field only when some event is detected there. This event is motion. A lot of smaller creatures are specialized mainly in this kind of processing, which gives their even smaller brains the power of vision.
Motion detection is a concept that has been exploited in computer vision too. Then why are we still behind? When I say motion, there are basically two kinds: motion caused by our visual system itself moving (which includes the motion of our body) and motion in the surroundings. How do we differentiate between the two? Motion in the surroundings is always limited to a certain region in space, and detecting motion in such a small region raises an interrupt and draws the attention of our brain towards it. Motion is not the only thing that interrupts the brain. In fact, the system that generates this interrupt doesn't even know what motion is! All it is concerned with is whether there was a change or not. So even a sudden flash of light can interrupt your brain, even though nothing is moving.
To detect this kind of change in computer vision systems, we diff the current frame against the previous one. Any change shows up as an edge in the resulting difference image, which gives us the location of the change. This detected change is not really an interrupt to our system, though, because we still spend time looking for that edge in the entire image. Our computer vision systems no doubt capture the whole surrounding in parallel, at once, but the processing still takes place serially, which forces them to take a back seat compared to our brain.
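The frame diff described above can be sketched roughly as follows. This is only a minimal illustration, not a definitive implementation: it assumes grayscale frames stored as NumPy `uint8` arrays, and the function name `changed_region` and the threshold of 25 are hypothetical choices of mine.

```python
import numpy as np

def changed_region(prev, curr, threshold=25):
    """Return the bounding box (top, left, bottom, right) of changed pixels, or None.

    `threshold` is an assumed value; a real system would tune it to sensor noise.
    """
    # Widen to signed integers so subtracting uint8 frames cannot wrap around.
    diff = np.abs(curr.astype(np.int16) - prev.astype(np.int16))
    mask = diff > threshold
    if not mask.any():
        return None  # no change between the two frames
    # Rows and columns that contain at least one changed pixel.
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    return int(rows[0]), int(cols[0]), int(rows[-1]), int(cols[-1])

# Two tiny synthetic frames: a bright patch appears in the second one.
prev = np.zeros((6, 8), dtype=np.uint8)
curr = prev.copy()
curr[2:4, 3:6] = 200

print(changed_region(prev, curr))  # (2, 3, 3, 5)
```

Notice that the function still has to touch every pixel of both frames before it can say where the change happened, which is exactly the serial scan I am complaining about.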