[ Marc L - minimountainchurch, flickr ]

Stereo Introduction

COS 351 - Computer Vision

[ slides: Kristen Grauman ]

review

In previous section:

  • Feature detection and matching
    • keypoint detection (find repeatable and distinctive)
    • local descriptors
    • correspondence
  • Model fitting and outlier rejection
    • optimization
    • Hough transform
    • RANSAC

multiple views

This section: multiple views

  • Today: intro and stereo
  • Next: Camera calibration
  • Then: Fundamental matrix

multiple views

  • stereo vision
  • structure from motion
  • optical flow


[ Hartley and Zisserman ]

multiple views: why?

Structure and depth are inherently ambiguous from single views



[ images: Lana Lazebnik ]

Cues

What cues help us to perceive 3D shape and depth?

cues: shading




[ figure: Prados and Faugeras 2006 ]

cues: focus/defocus

Images from same point of view, different camera parameters

3D shape / depth estimates

[ figures: H. Jin and P. Favaro 2002 ]

cues: texture

cues: perspective effects

[ image: S. Seitz ]

cues: motion

[ figures: L. Zhang, cylinder ]

cues: occlusion

[ René Magritte's famous painting Le Blanc-Seing (literal translation: "The Blank Signature"), a title that roughly translates as "free hand" or "free rein" ]

If stereo were critical for depth perception, navigation, recognition, etc., then this would be a problem

multi-view geometry problems

Structure: given projections of the same 3D point in two or more images, compute the 3D coordinates of that point

[ slide: Noah Snavely ]

multi-view geometry problems

Stereo correspondence: given a point in one of the images, where could its corresponding points be in the other images?

[ slide: Noah Snavely ]

multi-view geometry problems

Motion: given a set of corresponding points in two or more images, compute the camera parameters

[ slide: Noah Snavely ]

human eye

Rough analogy with human visual system:

Pupil/iris: control amount of light passing through lens

Retina: contains sensor cells, where image is formed

Fovea: highest concentration of cones

[ figure: Shapiro and Stockman ]

human stereopsis: disparity

Human eyes fixate on a point in space: they rotate so that the corresponding images form at the centers of the foveae

human stereopsis: disparity

Disparity occurs when eyes fixate on one object; others appear at different visual angles

[ adapted from David Forsyth, UC Berkeley ]

random dot stereograms

Béla Julesz 1960: Do we identify local brightness patterns before fusion (monocular process) or after (binocular)?

To test: a pair of synthetic images obtained by randomly spraying black dots on white objects

[ wikipedia ]
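A minimal sketch of how such a pair could be synthesized, assuming binary images stored as lists of rows. The function name `random_dot_stereogram` and the parameters `size`, `square`, `shift`, and `seed` are hypothetical and chosen for illustration; this is not Julesz's original procedure. A central square is displaced horizontally in the second image, so it carries a different disparity from the background even though neither image shows any outline on its own.

```python
import random

def random_dot_stereogram(size=64, square=24, shift=4, seed=0):
    """Return (left, right) binary images as lists of rows (1 = black dot, 0 = white)."""
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]          # background is identical in both views
    lo = (size - square) // 2
    hi = lo + square
    for y in range(lo, hi):
        # Copy the central square into the right image, displaced by `shift` pixels.
        for x in range(lo, hi):
            right[y][x - shift] = left[y][x]
        # The strip uncovered by the displacement has no match in the left image;
        # fill it with fresh random dots.
        for x in range(hi - shift, hi):
            right[y][x] = rng.randint(0, 1)
    return left, right

left, right = random_dot_stereogram()
# Print one row through the square from each image; each looks like noise on its own.
print("".join("#" if v else "." for v in left[32]))
print("".join("#" if v else "." for v in right[32]))
```

Each image alone is pure noise; the square exists only in the relationship between the two images.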

random dot stereograms

[ Forsyth and Ponce ]

random dot stereograms

  • When viewed monocularly, they appear random; when viewed stereoscopically, see 3D structure.
  • Human binocular fusion is not directly associated with the physical retinas; it must involve the central nervous system (V2, for instance)
  • Imaginary "cyclopean retina" that combines the left and right image stimuli as a single unit
  • High level scene understanding not required for stereo
  • But, high level scene understanding is arguably better than stereo

stereo photography and stereo viewers

Take two pictures of the same subject from two slightly different viewpoints and display so that each eye sees only one of the images

Invented by Sir Charles Wheatstone, 1838

[ image: fisher-price.com ]









[ Public Library, Stereoscopic Looking Room, Chicago, by Phillips, 1923 ]




autostereograms

Autostereograms exploit disparity as a depth cue using a single image (also called single image random dot stereograms or single image stereograms).

(answer at end of slides)
[ images: magiceye.com ]
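To make the idea concrete, here is a sketch of one simple construction of a single image random dot stereogram. It is an assumption for illustration, not necessarily how the images credited above were produced. The names `autostereogram`, `depth`, and `period` are hypothetical. Each pixel copies the pixel a short distance to its left, and that distance varies with an integer depth map, so the local repeat distance of the pattern encodes disparity.

```python
import random

def autostereogram(depth, period=8, seed=0):
    """Build a binary single-image stereogram from an integer depth map.

    depth[y][x] must satisfy 0 <= depth[y][x] < period.
    """
    rng = random.Random(seed)
    h, w = len(depth), len(depth[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sep = period - depth[y][x]        # local repeat distance of the pattern
            if x >= sep:
                out[y][x] = out[y][x - sep]   # repeat the dot seen `sep` pixels earlier
            else:
                out[y][x] = rng.randint(0, 1) # seed the row with random dots
    return out

# Hypothetical depth map: flat background with a raised rectangle in the middle.
depth = [[2 if 4 <= y < 12 and 16 <= x < 32 else 0 for x in range(48)] for y in range(16)]
for row in autostereogram(depth):
    print("".join("#" if v else "." for v in row))
```

When the eyes fuse neighbouring repeats of the pattern, regions with a different repeat distance are seen at a different depth.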

estimating depth with stereo

Stereo: shape from "motion" between two views

We'll need to consider:

  • Info on camera pose ("calibration")
  • Image point correspondences

stereo vision

two cameras, simultaneous views; single moving camera and static scene
single camera with mirror; multiple views; reconstruction

camera parameters

Extrinsic parameters: camera frame 1 \(\leftrightarrow\) camera frame 2

  • rotation matrix, translation vector

Intrinsic parameters: image coordinates relative to camera \(\leftrightarrow\) pixel coordinates

  • focal length, pixel sizes (mm), image center point, radial distortion parameters

We'll assume for now that these parameters are given and fixed
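As an illustration of what the two parameter sets do, the sketch below applies an extrinsic transform (rotation and translation between camera frames) and then an intrinsic mapping (focal length, pixel size, image center) under a plain pinhole model. Radial distortion is ignored, and the function names, axis conventions, and numeric values are all assumptions made for the example.

```python
def transform(R, t, P):
    """Extrinsics: map a 3D point from camera frame 1 to camera frame 2 (P2 = R P1 + t)."""
    return tuple(sum(R[i][j] * P[j] for j in range(3)) + t[i] for i in range(3))

def project_to_pixels(P, f_mm, pixel_mm, center_px):
    """Intrinsics: pinhole projection onto the image plane (mm), then mm -> pixels.

    Assumes square pixels of side `pixel_mm` and no radial distortion.
    """
    X, Y, Z = P
    x_mm = f_mm * X / Z
    y_mm = f_mm * Y / Z
    return (x_mm / pixel_mm + center_px[0], y_mm / pixel_mm + center_px[1])

# Hypothetical rig: identical cameras, parallel optical axes, 100 mm apart along x.
R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
t = (-100.0, 0.0, 0.0)
P1 = (50.0, 0.0, 1000.0)          # a point 1 m in front of camera 1, in mm
P2 = transform(R, t, P1)          # the same point expressed in camera frame 2

print(project_to_pixels(P1, 5.0, 0.01, (320.0, 240.0)))
print(project_to_pixels(P2, 5.0, 0.01, (320.0, 240.0)))
```

With an identity rotation and a purely horizontal translation, the two projections differ only in their horizontal pixel coordinate.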

geometry for a simple stereo system

Assume parallel optical axes, known camera parameters (i.e., calibrated cameras)

What is the expression for \(Z\)?

Similar triangles \((p_l, p, p_r)\) and \((o_l, p, o_r)\):

\[\frac{T + x_l - x_r}{Z - f} = \frac{T}{Z}\]

Cross-multiplying and solving for \(Z\):

\[Z = f \frac{T}{x_r - x_l}\]

Disparity: \(x_r-x_l\)
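A small numeric sketch of the depth formula, keeping the slide's sign convention for disparity (\(x_r - x_l\)). The function name `depth_from_disparity` and the sample values are hypothetical. It assumes \(f\) and the image coordinates share one unit (pixels here) and that \(T\) is in the unit wanted for \(Z\).

```python
def depth_from_disparity(x_l, x_r, f, T):
    """Depth Z = f * T / (x_r - x_l) for a parallel-axis stereo pair."""
    disparity = x_r - x_l
    if disparity == 0:
        # Zero disparity corresponds to a point infinitely far away.
        raise ValueError("zero disparity: depth is unbounded")
    return f * T / disparity

f = 700.0   # focal length in pixels (assumed value)
T = 0.1     # baseline in metres (assumed value)
for d in (70.0, 35.0, 17.5):
    print(d, depth_from_disparity(0.0, d, f, T))
```

Depth is inversely proportional to disparity: halving the disparity doubles the estimated depth, so distant points are resolved much more coarsely than nearby ones.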

depth from disparity

So if we could find the corresponding points in two images, we could estimate relative depth...



image \(I(x,y)\); disparity map \(D(x,y)\); image \(I'(x',y')\)

\[(x',y') = (x+D(x,y), y)\]
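The mapping can be sketched directly in code, assuming images stored as lists of rows and integer disparities. The helper names `corresponding_point` and `forward_warp` are hypothetical. The warp moves each pixel of \(I\) along its own row by \(D(x,y)\) to synthesize \(I'\).

```python
def corresponding_point(x, y, D):
    """Return (x', y') = (x + D(x, y), y): same row, shifted column."""
    return (x + D[y][x], y)

def forward_warp(I, D, fill=None):
    """Synthesize I' by sending each pixel of I to its corresponding point.

    Pixels that land outside the image are dropped; positions that receive
    no pixel keep `fill`. If two pixels land on the same position, the one
    visited later overwrites the earlier one.
    """
    h, w = len(I), len(I[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            xp, yp = corresponding_point(x, y, D)
            if 0 <= xp < w:
                out[yp][xp] = I[y][x]
    return out

I = [[10, 20, 30, 40]]
D = [[1, 1, 1, 1]]            # uniform disparity of one pixel
print(forward_warp(I, D))
```

Unfilled positions and collisions in the output correspond to regions visible in only one of the two views; this sketch makes no attempt to resolve them.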

autostereograms

[ images: magiceye.com ]