In the previous section: single views
This section: multiple views
Structure and depth are inherently ambiguous from single views
What cues help us perceive 3D shape and depth?
Images from the same point of view but with different camera parameters (e.g., focus settings) \(\rightarrow\) 3D shape / depth estimates
If stereo were critical for depth perception, navigation, recognition, etc., then losing it would be a serious problem
Structure: given projections of the same 3D point in two or more images, compute the 3D coordinates of that point
Stereo correspondence: given a point in one of the images, where could its corresponding points be in the other images?
Motion: given a set of corresponding points in two or more images, compute the camera parameters
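The correspondence problem above can be made concrete with one common (though not the only) approach: for rectified images, search along the same scanline in the other image and pick the patch that best matches under sum-of-squared-differences (SSD). A minimal sketch, with illustrative patch size and search range:

```python
import numpy as np

def best_match_on_scanline(left, right, y, x, half=3, max_disp=32):
    """For a pixel (x, y) in the rectified left image, search the same
    scanline of the right image and return the disparity whose patch
    minimizes the sum of squared differences (SSD).

    `half` (patch half-width) and `max_disp` (search range) are
    illustrative choices, not fixed by the text."""
    patch = left[y - half:y + half + 1, x - half:x + half + 1].astype(float)
    best_d, best_ssd = 0, np.inf
    for d in range(0, max_disp + 1):
        xr = x - d                      # candidate column in the right image
        if xr - half < 0:
            break
        cand = right[y - half:y + half + 1, xr - half:xr + half + 1].astype(float)
        ssd = np.sum((patch - cand) ** 2)
        if ssd < best_ssd:
            best_d, best_ssd = d, ssd
    return best_d

# Toy example: the right image is the left image shifted 4 pixels,
# so the matcher should recover a disparity of 4.
rng = np.random.default_rng(0)
left = rng.random((20, 40))
right = np.roll(left, -4, axis=1)
print(best_match_on_scanline(left, right, y=10, x=20))  # -> 4
```

Real systems add aggregation, smoothness constraints, and occlusion handling on top of this brute-force search.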
Rough analogy with human visual system:
Pupil/iris: control amount of light passing through lens
Retina: contains sensor cells, where image is formed
Fovea: highest concentration of cones
Human eyes fixate on a point in space: they rotate so that the corresponding images form at the centers of the foveas
Disparity: when the eyes fixate on one object, other objects appear at different visual angles
Béla Julesz 1960: Do we identify local brightness patterns before fusion (monocular process) or after (binocular)?
To test this, Julesz used pairs of synthetic images obtained by randomly spraying black dots on white objects
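Julesz's construction can be sketched as follows (image size, square size, and disparity are arbitrary illustrative choices): generate a random dot image, copy it, and shift a central square horizontally, refilling the uncovered strip with fresh dots. Neither image alone shows any shape, but fused binocularly the square appears at a different depth.

```python
import numpy as np

def random_dot_stereogram(h=128, w=128, shift=6, box=40, seed=0):
    """Julesz-style random-dot stereogram pair (a sketch; all sizes
    and the disparity `shift` are illustrative, not from the text)."""
    rng = np.random.default_rng(seed)
    left = (rng.random((h, w)) < 0.5).astype(np.uint8)
    right = left.copy()
    top, bottom = (h - box) // 2, (h + box) // 2
    l, r = (w - box) // 2, (w + box) // 2
    # Shift the central square left by `shift` pixels in the right image.
    right[top:bottom, l - shift:r - shift] = left[top:bottom, l:r]
    # Refill the uncovered strip with fresh random dots.
    right[top:bottom, r - shift:r] = (rng.random((box, shift)) < 0.5)
    return left, right

left, right = random_dot_stereogram()
```

Since each image in isolation is pure noise, any perceived depth must arise after binocular fusion, which was Julesz's point.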
Take two pictures of the same subject from two slightly different viewpoints and display so that each eye sees only one of the images
Invented by Sir Charles Wheatstone, 1838
Autostereograms exploit disparity as a depth cue using a single image (single-image random dot stereogram, or single-image stereogram)
Stereo: shape from "motion" between two views
We'll need to consider:
Extrinsic parameters: camera frame 1 \(\leftrightarrow\) camera frame 2
Intrinsic parameters: image coordinates relative to camera \(\leftrightarrow\) pixel coordinates
We'll assume for now that these parameters are given and fixed
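As a hedged sketch of the intrinsic-parameter mapping, the conversion between pixel coordinates and camera-frame coordinates can be written with a pinhole intrinsic matrix \(K\). The focal lengths and principal point below are illustrative numbers, not values from the text:

```python
import numpy as np

# Hypothetical intrinsic matrix K: focal lengths fx, fy and principal
# point (cx, cy), all in pixels -- illustrative numbers only.
K = np.array([[700.0,   0.0, 320.0],
              [  0.0, 700.0, 240.0],
              [  0.0,   0.0,   1.0]])

def pixel_to_camera(u, v):
    """Pixel coordinates -> normalized ray direction in the camera frame."""
    return np.linalg.inv(K) @ np.array([u, v, 1.0])

def camera_to_pixel(p):
    """3D point in the camera frame -> pixel coordinates (perspective projection)."""
    q = K @ p
    return q[:2] / q[2]

u, v = camera_to_pixel(np.array([0.1, -0.2, 2.0]))
```

The extrinsic parameters would add a rotation and translation between the two camera frames before this projection; here they are taken as given, per the assumption above.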
Assume parallel optical axes, known camera parameters (i.e., calibrated cameras)
What is the expression for \(Z\)?
Similar triangles \((p_l, p, p_r)\) and \((o_l, p, o_r)\): \[\frac{T + x_l - x_r}{Z - f} = \frac{T}{Z}\] \[Z = f\,\frac{T}{x_r - x_l}\] Disparity: \(x_r - x_l\)
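Plugging illustrative numbers into \(Z = fT/(x_r - x_l)\) shows the inverse relation between disparity and depth; the focal length, baseline, and disparities below are made up for the sketch:

```python
# Depth from disparity for a rectified stereo pair: Z = f * T / d.
# f, T, and the disparities are illustrative values, not from the text.
f = 700.0   # focal length, in pixels
T = 0.12    # baseline between the optical centers, in metres
for d in (70.0, 35.0, 14.0):   # disparity in pixels
    Z = f * T / d
    print(f"disparity {d:5.1f} px -> depth {Z:.2f} m")
```

Halving the disparity doubles the depth estimate, so nearby objects (large disparity) are located much more precisely than distant ones.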
So if we could find the corresponding points in two images, we could estimate relative depth...
For rectified images, a point \((x, y)\) in one image corresponds to \[(x',y') = (x+D(x,y),\, y)\] in the other, where \(D\) is the disparity map
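Assuming rectified images, the relation above says a match stays on the same scanline and is offset horizontally by the disparity. A minimal sketch of looking up a correspondence through a disparity map `D` (the map here is a toy constant array):

```python
import numpy as np

def corresponding_point(D, x, y):
    """Given a disparity map D for a rectified pair, return the match of
    pixel (x, y): same row, column shifted by D(x, y), per the relation
    (x', y') = (x + D(x, y), y)."""
    return int(x + D[y, x]), y

# Toy disparity map: a constant disparity of -5 pixels everywhere.
D = np.full((10, 10), -5)
print(corresponding_point(D, x=7, y=3))   # -> (2, 3)
```

A real disparity map varies per pixel, and estimating it densely is exactly the stereo correspondence problem discussed above.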