Flashed Face Distortion

Local Image Features

COS 351 - Computer Vision

[ slides: Derek Hoiem, Grauman and Leibe 2008 AAAI Tutorial ]

image representations

Templates
- intensity, gradients, etc.
Histograms
- color, texture, SIFT descriptors, etc.

image representations: histograms

Global Histogram

Represent distribution of features
- color, texture, depth, ...

[ image: Dave Kauchak ]

image representations: histograms

Histogram: Probability or count of data in each bin

Joint Histogram
- requires lots of data
- loss of resolution to avoid empty bins
Marginal Histogram
- requires independent features
- more data/bin than joint histogram
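
The trade-off above can be sketched with NumPy. This is a minimal illustration, assuming two hypothetical features with values in [0, 1) and 8 bins per feature; the data are random placeholders, not from the lecture.

```python
import numpy as np

# Hypothetical data: 1000 samples of two features (e.g. two color channels in [0, 1)).
rng = np.random.default_rng(0)
features = rng.random((1000, 2))

bins = 8  # bins per feature (assumed)

# Joint histogram: one cell per combination of feature values -> bins**2 cells.
joint, _ = np.histogramdd(features, bins=(bins, bins), range=[(0, 1), (0, 1)])

# Marginal histograms: one 1-D histogram per feature -> bins cells each.
marginals = [np.histogram(features[:, d], bins=bins, range=(0, 1))[0]
             for d in range(2)]

# Convert counts to probabilities.
joint_prob = joint / joint.sum()
marginal_prob = [m / m.sum() for m in marginals]

print("joint cells:", joint.size, "mean count per cell:", joint.mean())
print("marginal bins:", bins, "mean count per bin:", marginals[0].mean())
```

With the same 1000 samples, the joint histogram spreads them over 64 cells while each marginal spreads them over 8 bins, which is why the joint form needs more data or coarser bins.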

[ image: Dave Kauchak ]

image representations: histograms

Clustering: Use the same cluster centers for all images
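
A minimal sketch of the idea, assuming plain k-means as the clustering method and random placeholder features. The function names (`kmeans`, `assign`, `cluster_histogram`) and the choice of 16 clusters are illustrative, not from the slides.

```python
import numpy as np

def assign(data, centers):
    """Index of the nearest center (Euclidean) for each row of data."""
    d2 = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(data, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (k, d) cluster centers."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)].astype(float)
    for _ in range(iters):
        labels = assign(data, centers)
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def cluster_histogram(features, centers):
    """Normalized count of features assigned to each shared center."""
    counts = np.bincount(assign(features, centers), minlength=len(centers))
    return counts / counts.sum()

# Hypothetical features: rows are per-pixel or per-patch feature vectors.
rng = np.random.default_rng(1)
image_a = rng.random((500, 3))
image_b = rng.random((300, 3))

# Centers are learned once from pooled features, then reused for every image.
centers = kmeans(np.vstack([image_a, image_b]), k=16)
hist_a = cluster_histogram(image_a, centers)
hist_b = cluster_histogram(image_b, centers)
```

Because both histograms index the same 16 centers, bin i means the same thing in every image, so the histograms can be compared directly.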

[ image: Dave Kauchak ]

what do we compute histograms of?

Color
Texture (filter banks or HOG over regions)

L*a*b* color space

HSV color space
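
As one possible example of a color histogram, the sketch below bins the hue channel of an HSV conversion. It assumes a float RGB image in [0, 1] and uses Python's standard `colorsys` module for the conversion; the function name `hue_histogram`, the 10 bins, and the test image are illustrative choices.

```python
import colorsys
import numpy as np

def hue_histogram(rgb, bins=10):
    """Normalized histogram of hue for an (H, W, 3) float RGB image in [0, 1]."""
    pixels = rgb.reshape(-1, 3)
    hues = np.array([colorsys.rgb_to_hsv(r, g, b)[0] for r, g, b in pixels])
    counts, _ = np.histogram(hues, bins=bins, range=(0.0, 1.0))
    return counts / counts.sum()

# Hypothetical test image: left half pure red, right half pure blue.
img = np.zeros((4, 4, 3))
img[:, :2, 0] = 1.0
img[:, 2:, 2] = 1.0

hist = hue_histogram(img)
# Red has hue 0 (bin 0); blue has hue 2/3 (bin 6 of 10).
```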

what do we compute histograms of?

Different local feature descriptors:

Scale-Invariant Feature Transform (SIFT)
Speeded Up Robust Features (SURF)
Shape Context
Geometric Blur
Histogram of Oriented Gradients (HOG)
Gradient Location and Orientation Histogram (GLOH)
...

[ SIFT—Lowe IJCV 2004 ]

local descriptors: SIFT

SIFT vector formation:

Computed on a version of the window rotated and scaled according to the keypoint's computed orientation and scale
- resample the window
Based on gradients weighted by a Gaussian with σ equal to half the window width (for smooth falloff)

local descriptors: SIFT

SIFT vector formation:

4x4 array of gradient orientation histograms, weighted by gradient magnitude (figure shows only 2x2)
8 orientations times 4x4 array = 128 dimensions
Motivation: some sensitivity to spatial layout, but not too much
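
The 4x4x8 layout can be sketched as follows. This is a deliberately simplified version, assuming a 16x16 grayscale window and hard binning: it omits the Gaussian weighting, interpolation, rotation to the dominant orientation, and normalization that full SIFT uses. The name `sift_like_descriptor` is hypothetical.

```python
import numpy as np

def sift_like_descriptor(patch, cells=4, n_orient=8):
    """Hard-binned cells x cells x n_orient orientation histogram of a square patch."""
    gy, gx = np.gradient(patch.astype(float))        # derivatives along rows, columns
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % (2 * np.pi)           # orientation in [0, 2*pi)
    obin = np.floor(ang / (2 * np.pi) * n_orient).astype(int) % n_orient

    size = patch.shape[0] // cells                   # pixels per cell side
    desc = np.zeros((cells, cells, n_orient))
    for i in range(cells):
        for j in range(cells):
            sl = (slice(i * size, (i + 1) * size), slice(j * size, (j + 1) * size))
            # Sum gradient magnitudes per orientation bin inside this cell.
            desc[i, j] = np.bincount(obin[sl].ravel(), weights=mag[sl].ravel(),
                                     minlength=n_orient)
    return desc.ravel()

patch = np.random.default_rng(0).random((16, 16))    # hypothetical 16x16 window
d = sift_like_descriptor(patch)
print(d.shape)  # (128,)
```

Each cell keeps orientation statistics but forgets where inside the cell a gradient occurred, which gives the partial sensitivity to spatial layout mentioned above.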

local descriptors: SIFT

Ensure smoothness:

Gaussian weight
Trilinear interpolation
- a given gradient contributes to 8 bins: 4 in space times 2 in orientation
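
A minimal sketch of the 8-bin vote, assuming continuous coordinates measured in bin units with bin centers at integer values (a convention chosen here for illustration). Orientation wraps around; spatial bins that fall outside the grid are dropped. The name `trilinear_vote` is hypothetical.

```python
import numpy as np

def trilinear_vote(hist, row, col, orient, magnitude):
    """Spread one gradient sample over its 2x2x2 neighbouring bins."""
    cells, _, n_orient = hist.shape
    r0, c0, o0 = int(np.floor(row)), int(np.floor(col)), int(np.floor(orient))
    dr, dc, do = row - r0, col - c0, orient - o0
    for r, wr in ((r0, 1 - dr), (r0 + 1, dr)):
        for c, wc in ((c0, 1 - dc), (c0 + 1, dc)):
            if not (0 <= r < cells and 0 <= c < cells):
                continue  # spatial neighbour lies outside the descriptor grid
            for o, wo in ((o0, 1 - do), (o0 + 1, do)):
                hist[r, c, o % n_orient] += magnitude * wr * wc * wo

hist = np.zeros((4, 4, 8))
trilinear_vote(hist, row=1.25, col=2.5, orient=7.5, magnitude=1.0)
print(np.count_nonzero(hist), hist.sum())  # 8 1.0
```

The weights in each dimension sum to 1, so a sample in the interior of the grid distributes exactly its magnitude, and a small shift in position or orientation changes the histogram smoothly.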

local descriptors: SIFT

Reduce effect of illumination:

128-dim vector normalized to unit length
Threshold gradient magnitudes to avoid excessive influence of high gradients
- After normalization, clamp values > 0.2 to 0.2
- Renormalize
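
The three steps can be sketched directly. The function name `normalize_descriptor` and the small `eps` guard against division by zero are additions for illustration; the 0.2 threshold is the value on the slide.

```python
import numpy as np

def normalize_descriptor(desc, clamp=0.2, eps=1e-12):
    """Normalize to unit length, clamp large entries, then renormalize."""
    desc = np.asarray(desc, dtype=float)
    desc = desc / (np.linalg.norm(desc) + eps)   # unit length
    desc = np.minimum(desc, clamp)               # limit the influence of large entries
    return desc / (np.linalg.norm(desc) + eps)   # unit length again

# Hypothetical raw descriptor with one dominant gradient bin.
raw = np.ones(128)
raw[0] = 100.0
out = normalize_descriptor(raw)
```

The first normalization cancels a uniform scaling of gradient magnitudes (for example, a contrast change), and the clamp reduces the weight of a few very strong gradients relative to the overall orientation pattern.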

local descriptors: SURF

Fast approximation of SIFT idea

Efficient computation by 2D box filters and integral images
⇒ 6 times faster than SIFT
Equivalent quality for object identification
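
The integral-image trick behind the box filters can be sketched as below. This shows only the constant-time box sum, not the SURF detector or descriptor; the function names are illustrative.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero row and column prepended."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom, left:right] from four table lookups."""
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

img = np.arange(36, dtype=float).reshape(6, 6)   # hypothetical 6x6 image
ii = integral_image(img)
total = box_sum(ii, 1, 2, 4, 5)                  # same as img[1:4, 2:5].sum()
```

After one pass to build the table, any box sum costs four lookups regardless of the box size, which is what makes large box filters cheap.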

GPU implementation available

Feature extraction at 200 Hz (detector + descriptor, 640x480 image)

[ Bay ECCV 2006. Cornelis CVGPU 2008. K. Grauman, B. Leibe ]

local descriptors: shape context

Count the number of points inside each bin
Log-polar binning
- more precision for nearby points
- more flexibility for farther points
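
A minimal sketch of log-polar binning for one reference point. The bin counts (5 radial, 12 angular), the radius limits, and the normalization by mean pairwise distance are assumed values for illustration, and the name `shape_context` is used here for a simplified histogram only.

```python
import numpy as np

def shape_context(points, index, n_r=5, n_theta=12, r_inner=0.125, r_outer=2.0):
    """Log-polar histogram of the other points relative to points[index]."""
    pts = np.asarray(points, dtype=float)
    diff = np.delete(pts, index, axis=0) - pts[index]

    # Normalize radii by the mean pairwise distance (assumed scale normalization).
    dists = np.linalg.norm(pts[:, None] - pts[None, :], axis=2)
    mean_dist = dists[np.triu_indices(len(pts), k=1)].mean()
    r = np.linalg.norm(diff, axis=1) / mean_dist
    theta = np.arctan2(diff[:, 1], diff[:, 0]) % (2 * np.pi)

    # Edges uniform in log(r): narrow bins near the point, wide bins far away.
    r_edges = np.logspace(np.log10(r_inner), np.log10(r_outer), n_r + 1)
    r_bin = np.searchsorted(r_edges, r, side="right") - 1
    t_bin = np.floor(theta / (2 * np.pi) * n_theta).astype(int) % n_theta

    hist = np.zeros((n_r, n_theta), dtype=int)
    inside = (r_bin >= 0) & (r_bin < n_r)      # drop points outside the radius range
    np.add.at(hist, (r_bin[inside], t_bin[inside]), 1)
    return hist

# Hypothetical edge points sampled from a shape.
pts = np.random.default_rng(0).random((50, 2))
hist = shape_context(pts, index=0)
```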

[ Belongie and Malik ICCV 2001. K. Grauman, B. Leibe ]

local descriptors: geometric blur

Compute edges at four orientations
Extract a patch in each channel
Apply spatially varying blur and sub-sample
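
A rough sketch of the last two steps for a single channel, assuming the blur grows linearly with distance from the patch center (sigma = alpha * r + beta). The values of `alpha` and `beta`, the sample radii, and the number of angles are assumptions; the edge channel is a synthetic placeholder, and the function names are hypothetical.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur using 1-D convolutions (zero padding at borders)."""
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    kernel /= kernel.sum()
    rows = np.apply_along_axis(np.convolve, 1, img, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="same")

def geometric_blur(channel, center, radii=(0, 4, 8, 12), n_angles=8,
                   alpha=0.25, beta=1.0):
    """Sample a channel on rings around center, with more blur on outer rings."""
    cy, cx = center
    samples = []
    for r in radii:
        blurred = gaussian_blur(channel, alpha * r + beta)   # more blur farther out
        angles = [0.0] if r == 0 else np.linspace(0, 2 * np.pi, n_angles,
                                                  endpoint=False)
        for a in angles:
            y = int(round(cy + r * np.sin(a)))
            x = int(round(cx + r * np.cos(a)))
            samples.append(blurred[y, x])
    return np.array(samples)

# Hypothetical edge-channel response: a single horizontal line.
channel = np.zeros((33, 33))
channel[16, :] = 1.0
desc = geometric_blur(channel, center=(16, 16))
print(desc.shape)  # (25,)
```

Sparse samples far from the center come from heavily blurred copies, so small geometric distortions move them less in descriptor space than they would for an unblurred patch.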

[ Berg and Malik CVPR 2001. K. Grauman, B. Leibe ]

local descriptors: self-similarity

[ Matching Local Self-Similarities across Images and Videos, Shechtman and Irani, 2007 ]

learning local image descriptors

[ Winder and Brown 2007 ]

matching local features

Nearest neighbor (Euclidean distance)
Threshold the ratio of the distance to the nearest descriptor over the distance to the 2nd nearest
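
A minimal brute-force sketch of nearest-neighbor matching with the ratio test. The function name `match_ratio_test` and the descriptors are placeholders; the default threshold of 0.8 is the value reported in Lowe IJCV 2004, treated here as an adjustable assumption.

```python
import numpy as np

def match_ratio_test(desc_a, desc_b, ratio=0.8):
    """Return (i, j) pairs where desc_b[j] is the nearest neighbour of desc_a[i]
    and the nearest distance is below ratio times the second-nearest distance."""
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    order = np.argsort(d, axis=1)
    nearest, second = order[:, 0], order[:, 1]
    rows = np.arange(len(desc_a))
    keep = d[rows, nearest] < ratio * d[rows, second]
    return [(int(i), int(j)) for i, j in zip(rows[keep], nearest[keep])]

# Hypothetical descriptors: two noisy copies of rows 3 and 7, plus one
# ambiguous query halfway between rows 0 and 1.
rng = np.random.default_rng(0)
desc_b = rng.random((20, 128))
desc_a = np.vstack([
    desc_b[3] + 0.01 * rng.standard_normal(128),
    desc_b[7] + 0.01 * rng.standard_normal(128),
    0.5 * (desc_b[0] + desc_b[1]),
])
matches = match_ratio_test(desc_a, desc_b)
print(matches)  # [(0, 3), (1, 7)]
```

The ambiguous query is rejected because its two closest candidates are almost equally distant, which is the case the ratio test is meant to filter out.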

[ Lowe IJCV 2004 ]

choosing a descriptor

Again, no need to stick to one descriptor
For object instance recognition or stitching, SIFT or variant is a good choice

things to remember

Descriptors: robust and selective

spatial histograms of orientation
SIFT
