Detectors and Descriptors
Applications To Have In Mind
Core computer vision applications.
Image alignment: Determining if different images contain parts of the same photo or object viewed from another angle.
Building 3D reconstructions from multiple images.
Stitching photos taken at different angles to create a panorama.
One Example: Image Alignment
Aligning two given images.
One Solution
Iterate through possible translations to find the best alignment.
Check alignment with images for all pixels within a search range:
for y in range(-ySearch, ySearch + 1):
    for x in range(-xSearch, xSearch + 1):
        check_alignment_with_images()
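A minimal runnable sketch of this brute-force search (helper names like `ssd` and `best_translation` are hypothetical; assumes grayscale images as lists of lists, scored by sum of squared differences over the overlapping region):

```python
def ssd(a, b, dx, dy):
    """Sum of squared differences between image a and image b shifted
    by (dx, dy), over the region where the two overlap."""
    h, w = len(a), len(a[0])
    total = 0
    for y in range(h):
        for x in range(w):
            ys, xs = y + dy, x + dx
            if 0 <= ys < h and 0 <= xs < w:
                total += (a[y][x] - b[ys][xs]) ** 2
    return total

def best_translation(a, b, search):
    """Try every integer shift in [-search, search]^2, keep the lowest SSD."""
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            score = ssd(a, b, dx, dy)
            if best is None or score < best[0]:
                best = (score, dx, dy)
    return best[1], best[2]

# Recover a known shift: b is a with its single bright pixel moved by (1, 1).
a = [[0] * 4 for _ in range(4)]; a[1][1] = 9
b = [[0] * 4 for _ in range(4)]; b[2][2] = 9
shift = best_translation(a, b, 1)   # (1, 1)
```

Note that the nested search makes the cost grow quadratically with the search range, which is part of the motivation for the feature-based approach below.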
One Motivating Example
Aligning images with 3D rotation and translation.
An Alternate Approach
Find corners and features in the images.
Match these features based on local image data.
What Now?
Given pairs of corresponding points (p1, p2), how do we align the images?
Consider translation-only case.
An Alternate Approach
Solve for transformation T (e.g., such that p1 ≡ T p2) that fits the matches well.
Homogeneous coordinates are used.
An Alternate Approach
Blend the aligned images together.
Key insight: Operate on parts of the image rather than the entire image.
Today: Finding Edges and Corners
Focus on finding edges and corners in images.
Edges often occur at object boundaries or changes in surface properties.
Where do Edges Come From?
Edges can arise from depth/distance discontinuities.
Edges can arise from surface normal / orientation discontinuities.
Edges can arise from surface color / reflectance discontinuities.
Edges can arise from illumination discontinuities.
Derivatives
Illustrates derivative filters I_x and I_y.
I_x filter: [-1, 0, 1] (horizontal derivative)
I_y filter: [-1, 0, 1]^T (vertical derivative)
Derivatives
Derivative: Rate of change of a function f(x) at a point, indicating the direction of increase.
Gradient: Vector of all partial derivatives.
What Should I Know?
Gradients are partial derivatives per dimension.
For an n-dimensional input, the gradient also has n dimensions.
Gradients point in the direction of steepest ascent, and their magnitude gives the rate of ascent.
If a is a minimum of f(x), then \nabla f(a) = 0.
The converse is not always true: \nabla f(a) = 0 does not imply a minimum (e.g., saddle points), especially in high-dimensional spaces.
Why Does This Work?
Definition of a partial derivative:
\frac{\partial f(x, y)}{\partial x} = \lim_{\epsilon \to 0} \frac{f(x + \epsilon, y) - f(x, y)}{\epsilon}
Approximation using a finite difference:
\frac{\partial f(x, y)}{\partial x} \approx f(x + 1, y) - f(x, y)
Alternative approximation:
\frac{\partial f(x, y)}{\partial x} \approx \frac{f(x + 1, y) - f(x - 1, y)}{2}
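As a quick numeric check (a minimal sketch; the function and step size are arbitrary choices), the central difference is exact for a quadratic, while the forward difference carries a bias proportional to the step:

```python
# Compare forward vs. central finite differences on f(x) = x^2,
# whose exact derivative at x = 3 is 6.
def f(x):
    return x * x

x, h = 3.0, 1.0
forward = (f(x + h) - f(x)) / h            # = 2x + h = 7.0 (biased by h)
central = (f(x + h) - f(x - h)) / (2 * h)  # = 2x = 6.0 (exact for quadratics)
```

This is why the symmetric [-1, 0, 1] filter is preferred over the one-sided difference.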
Other Differentiation Operations
Prewitt and Sobel operators are used for edge detection.
Prewitt filters:
Horizontal: [-1, 0, 1; -1, 0, 1; -1, 0, 1]
Vertical: [1, 1, 1; 0, 0, 0; -1, -1, -1]
Sobel filters:
Horizontal: [-1, 0, 1; -2, 0, 2; -1, 0, 1]
Vertical: [1, 2, 1; 0, 0, 0; -1, -2, -1]
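To make the operators concrete, here is a minimal pure-Python sketch (`correlate3` is a hypothetical helper) applying the horizontal Sobel filter to a vertical step edge; note it computes cross-correlation, which matches the way the filters are written above:

```python
def correlate3(img, k):
    """Valid-region cross-correlation of img with a 3x3 kernel k."""
    h, w = len(img), len(img[0])
    out = [[0] * (w - 2) for _ in range(h - 2)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            s = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    s += k[dy + 1][dx + 1] * img[y + dy][x + dx]
            out[y - 1][x - 1] = s
    return out

sobel_x = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
# A vertical step edge: dark on the left, bright on the right.
img = [[0, 0, 10, 10]] * 4
gx = correlate3(img, sobel_x)   # [[40, 40], [40, 40]]
```

Every valid pixel straddles the step, so the horizontal response is uniformly large; the vertical Sobel filter would respond with zeros on this image.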
Images as Functions or Points
Treating an image as a point in R^{H \times W} or as a function of x, y.
Image gradient:
\nabla I(x, y) = \begin{bmatrix} \frac{\partial I}{\partial x}(x, y) \\ \frac{\partial I}{\partial y}(x, y) \end{bmatrix}
\frac{\partial I}{\partial x}(x, y) (often called I_x) represents the horizontal change in intensity at (x, y).
Image Gradient Direction
Examples of gradient directions.
\nabla f = [\frac{\partial f}{\partial x}, 0]
\nabla f = [0, \frac{\partial f}{\partial y}]
\nabla f = [\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}]
Image Gradient
Gradient: direction of maximum change.
Relationship to edge direction.
Image Gradient
Magnitude of the gradient: \sqrt{I_x^2 + I_y^2}
Image Gradient
Orientation of the gradient: atan2(I_y, I_x)
Lightness equal to the gradient magnitude.
Image Gradient
Orientation of the gradient: atan2(I_y, I_x)
Discusses the structure observed in gradient orientations.
Noise
Effect of noise on a row of an image f(x, y).
Noise
Convolution with a derivative filter amplifies noise.
D_{i,j} = (I_{i,j+1} + \epsilon_{i,j+1}) - (I_{i,j-1} + \epsilon_{i,j-1})
Where I_{i,j} is the true image, and \epsilon_{i,j} \sim N(0, \sigma^2).
D_{i,j} = (I_{i,j+1} - I_{i,j-1}) + \epsilon_{i,j+1} - \epsilon_{i,j-1}
\epsilon_{i,j} - \epsilon_{k,l} \sim N(0, 2\sigma^2): the variance doubles.
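The variance-doubling claim can be checked numerically; a minimal Monte Carlo sketch (seed, sigma, and sample count are arbitrary choices):

```python
import random

# If eps1, eps2 ~ N(0, sigma^2) independently, then
# Var(eps1 - eps2) = Var(eps1) + Var(eps2) = 2 * sigma^2.
random.seed(0)
sigma = 1.0
n = 200_000
diffs = [random.gauss(0, sigma) - random.gauss(0, sigma) for _ in range(n)]
mean = sum(diffs) / n
var = sum((d - mean) ** 2 for d in diffs) / n
# var comes out close to 2 * sigma**2 = 2.0
```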
Noise
Consider a row of f(x, y) (i.e., make y constant).
How can we use the last class to fix this?
Handling Noise
Convolution with a Gaussian kernel followed by differentiation.
\frac{d}{dx}(f * g)
Noisy Input
Noise in 2D, visualized with a zoomed-in meme image.
I_x calculated via the [-1, 0, 1] filter.
Noise + Smoothing
Smoothed input image, then applying the [-1, 0, 1] filter to get I_x.
Let’s Make It One Pass (1D)
Differentiation of a convolution is equivalent to convolving with the derivative of the kernel.
\frac{d}{dx}(f * g) = f * \frac{d}{dx}g
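This identity holds exactly for discrete convolution, since convolution is associative and commutative; a minimal sketch (the signal and kernels are arbitrary choices):

```python
def conv(a, b):
    """Full 1-D discrete convolution."""
    n = len(a) + len(b) - 1
    out = [0.0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

f = [0, 0, 1, 3, 6, 3, 1, 0, 0]   # a 1-D signal
g = [0.25, 0.5, 0.25]             # small smoothing kernel
d = [1, 0, -1]                    # central-difference filter

two_pass = conv(conv(f, g), d)  # smooth first, then differentiate
one_pass = conv(f, conv(d, g))  # differentiate the kernel: single pass
# The two results agree, so one precomputed "derivative-of-Gaussian"
# kernel replaces two separate convolutions.
```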
Let's Make It One Pass (2D)
Gaussian Derivative Filter
Which one finds horizontal edges? (The one that is differentiating in the y direction)
Applying the Gaussian Derivative
Applying Gaussian derivative filters of different widths (1, 3, 7 pixels).
Removes noise but blurs edges.
Compared with the Past
Comparing Gaussian derivative filter with Sobel filter.
Gaussian Derivative:
[1, 0, -1; 2, 0, -2; 1, 0, -1]
Sobel Filter: [1, 2, 1; 0, 0, 0; -1, -2, -1]
Filters We’ve Seen
Comparison of smoothing filters, derivative filters, and Gaussian derivative filters.
Smoothing filters:
Goal: Remove noise
Sums to: 1
Derivative filters:
Goal: Find edges
Sums to: 0
Problems
Edge detection in general is still an unsolved problem.
Localizing Reliably
Metaphor for feature distinctiveness.
Guadalupe Street, Intersection of Speedway and Dean Keeton, On campus.
Desirables for Descriptor
Properties of a good descriptor:
Repeatable: Should find the same things even with distortion.
Saliency: Each feature should be distinctive.
Compactness: Shouldn’t just be all the pixels.
Locality: Should only depend on local image data.
Example
Find the correspondences between two images.
Example Matches
Look for the colored squares
Basic Idea
Edge: No change along the edge direction.
Corner: Significant change in all directions.
Flat region: No change in all directions.
Corners should exhibit large intensity changes with any small shift of the window.
Formalizing Corner Detection
Introduction to mathematical formalization.
Formalizing Corner Detection
Zoom-In at x,y, Original Image.
Formalizing Corner Detection
Window at (x, y) and (x+u, y+v).
Measuring Similarity? Zoom-In at x,y.
Formalizing Corner Detection
Error (sum of squared differences) for an (u, v) offset:
E(u, v) = \sum_{(x, y) \in W} (I[x + u, y + v] - I[x, y])^2
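A direct, minimal implementation of E(u, v) for a square window (pure Python; the function name and window convention are illustrative):

```python
def window_error(img, x0, y0, size, u, v):
    """E(u, v): sum of squared differences between the window whose
    top-left corner is (x0, y0) and the same window shifted by (u, v)."""
    e = 0
    for y in range(y0, y0 + size):
        for x in range(x0, x0 + size):
            e += (img[y + v][x + u] - img[y][x]) ** 2
    return e

# A bright 2x2 block: shifting the window off the block raises the error.
img = [[0, 0, 0, 0],
       [0, 5, 5, 0],
       [0, 5, 5, 0],
       [0, 0, 0, 0]]
e00 = window_error(img, 1, 1, 2, 0, 0)   # 0 (no shift)
e10 = window_error(img, 1, 1, 2, 1, 0)   # 50
```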
Formalizing Corner Detection
Sum of squared differences between the window shifted by (u, v) = (-2, -3) and the unshifted window.
Formalizing Corner Detection
Error at u=0, v=0 is always 0. Why?
Match The Location and Plot
Original Image and Zoom-In A, B Error Options
Ok But Back To Math
E(u, v) = \sum_{(x, y) \in W} (I[x + u, y + v] - I[x, y])^2
Shifting windows around is expensive! We’ll find a trick to approximate this.
Note: only need to get the gist
Aside: Taylor Series for Images
Taylor series for linearizing a function:
f(x + d) \approx f(x) + \frac{\partial f}{\partial x} d
Applying Taylor series to images:
I(x + u, y + v) \approx I(x, y) + I_x u + I_y v
I_x and I_y are the partial derivatives at (x, y).
Formalizing Corner Detection
E(u, v) = \sum_{(x, y) \in W} (I[x + u, y + v] - I[x, y])^2
\approx \sum_{(x, y) \in W} (I[x, y] + I_x u + I_y v - I[x, y])^2
= \sum_{(x, y) \in W} (I_x u + I_y v)^2
= \sum_{(x, y) \in W} (I_x^2 u^2 + 2 I_x I_y u v + I_y^2 v^2)
Formalizing Corner Detection
E(u, v) \approx \sum_{(x, y) \in W} (I_x^2 u^2 + 2 I_x I_y u v + I_y^2 v^2)
= [u, v] \, M \, [u, v]^T
Where M is the second moment matrix.
M = \begin{bmatrix} \sum_{(x, y) \in W} I_x^2 & \sum_{(x, y) \in W} I_x I_y \\ \sum_{(x, y) \in W} I_x I_y & \sum_{(x, y) \in W} I_y^2 \end{bmatrix}
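Computing the entries of M is just three sums over the window's gradients; a minimal sketch (takes the per-pixel gradients (I_x, I_y) as a flat list; the helper name is illustrative):

```python
def second_moment(grads):
    """M from a list of per-pixel gradients (Ix, Iy) inside the window."""
    sxx = sum(ix * ix for ix, iy in grads)
    sxy = sum(ix * iy for ix, iy in grads)
    syy = sum(iy * iy for ix, iy in grads)
    return [[sxx, sxy], [sxy, syy]]

# A vertical edge has purely horizontal gradients: one big entry.
M_edge = second_moment([(2, 0), (2, 0)])     # [[8, 0], [0, 0]]
# A corner mixes gradient directions: both diagonal entries large.
M_corner = second_moment([(2, 0), (0, 2)])   # [[4, 0], [0, 4]]
```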
Intuitively what is M?
M = \begin{bmatrix} \sum_{(x, y) \in W} I_x^2 & \sum_{(x, y) \in W} I_x I_y \\ \sum_{(x, y) \in W} I_x I_y & \sum_{(x, y) \in W} I_y^2 \end{bmatrix} \approx \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix}
a, b both small: flat, e.g. [0.1, 0; 0, 0.1]
One big, the other small: edge, e.g. [50, 0; 0, 0.1] or [0.1, 0; 0, 50]
a, b both big: corner, e.g. [50, 0; 0, 50]
Intuitively what is M?
M = \begin{bmatrix} \sum_{(x, y) \in W} I_x^2 & \sum_{(x, y) \in W} I_x I_y \\ \sum_{(x, y) \in W} I_x I_y & \sum_{(x, y) \in W} I_y^2 \end{bmatrix}
Image might be rotated by rotation θ!
Intuitively what is M?
M = \begin{bmatrix} \sum_{(x, y) \in W} I_x^2 & \sum_{(x, y) \in W} I_x I_y \\ \sum_{(x, y) \in W} I_x I_y & \sum_{(x, y) \in W} I_y^2 \end{bmatrix} = V^{-1} \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix} V
If the image is rotated by angle θ (rotation matrix V):
M will look like V^{-1} \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix} V
So What Now?
Calculate M at each pixel by summing nearby gradients, but we still need access to a and b.
Given M, decompose it into eigenvectors V and eigenvalues \lambda_1, \lambda_2:
M = V^{-1} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} V
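For a symmetric 2×2 matrix the eigenvalues have a closed form, so no general eigensolver is needed; a minimal sketch (function name is illustrative):

```python
import math

def eig2_sym(a, b, c):
    """Eigenvalues (largest first) of the symmetric 2x2 matrix [[a, b], [b, c]]."""
    mid = (a + c) / 2
    rad = math.sqrt(((a - c) / 2) ** 2 + b * b)
    return mid + rad, mid - rad

lam1, lam2 = eig2_sym(2, 1, 2)   # (3.0, 1.0)
```

For a diagonal M the eigenvalues are just the diagonal entries a and b, matching the intuition on the previous slide.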
So What Now?
M = \begin{bmatrix} \sum_{(x, y) \in W} I_x^2 & \sum_{(x, y) \in W} I_x I_y \\ \sum_{(x, y) \in W} I_x I_y & \sum_{(x, y) \in W} I_y^2 \end{bmatrix} = V^{-1} \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix} V
Compute the quantity R from M:
R = \det(M) - \alpha \, \mathrm{trace}(M)^2 = \lambda_1 \lambda_2 - \alpha (\lambda_1 + \lambda_2)^2
\alpha is an empirical constant, usually 0.04–0.06.
This is an easy, fast formula: for a 2×2 matrix, the determinant and trace (the sum of the diagonal) come straight from the entries, with no eigendecomposition needed.
So What Now?
R tells us whether we’re at a corner, edge, or flat region.
R = \det(M) - \alpha \, \mathrm{trace}(M)^2 = \lambda_1 \lambda_2 - \alpha (\lambda_1 + \lambda_2)^2
If R >> 0: corner (\lambda_1 \approx \lambda_2 >> 0)
If R << 0: edge (\lambda_1 >> \lambda_2 or \lambda_2 >> \lambda_1)
If |R| ≈ 0: flat (\lambda_1, \lambda_2 \approx 0)
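Plugging in the example matrices from the earlier slide (a minimal sketch; α = 0.05 is within the usual empirical range):

```python
def harris_response(M, alpha=0.05):
    """R = det(M) - alpha * trace(M)^2 for a 2x2 second-moment matrix."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    tr = M[0][0] + M[1][1]
    return det - alpha * tr * tr

flat   = harris_response([[0.1, 0], [0, 0.1]])   # ≈ 0.008, near zero
edge   = harris_response([[50, 0], [0, 0.1]])    # ≈ -120.5, strongly negative
corner = harris_response([[50, 0], [0, 50]])     # = 2000.0, strongly positive
```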
What Do I Need To Know?
Need to be able to take derivatives of image
Need to be able to compute the entries of M at every pixel.
Should know that some properties of M indicate whether a pixel is a corner or not.
M = \begin{bmatrix} \sum_{(x, y) \in W} I_x^2 & \sum_{(x, y) \in W} I_x I_y \\ \sum_{(x, y) \in W} I_x I_y & \sum_{(x, y) \in W} I_y^2 \end{bmatrix}
In Practice
Compute partial derivatives Ix, Iy per pixel
Compute M at each pixel, using Gaussian weighting w
M = \begin{bmatrix} \sum_{(x, y) \in W} w(x, y) I_x^2 & \sum_{(x, y) \in W} w(x, y) I_x I_y \\ \sum_{(x, y) \in W} w(x, y) I_x I_y & \sum_{(x, y) \in W} w(x, y) I_y^2 \end{bmatrix}
In Practice
Compute partial derivatives Ix, Iy per pixel
Compute M at each pixel, using Gaussian weighting w
Compute response function R
R = \det(M) - \alpha \, \mathrm{trace}(M)^2 = \lambda_1 \lambda_2 - \alpha (\lambda_1 + \lambda_2)^2
Computing R
Visual representation of computing R.
In Practice
Compute partial derivatives Ix, Iy per pixel
Compute M at each pixel, using Gaussian weighting w
Compute response function R
Threshold R
Thresholded R
Visual representation of a thresholded R.
In Practice
Compute partial derivatives Ix, Iy per pixel
Compute M at each pixel, using Gaussian weighting w
Compute response function R
Threshold R
Take only local maxima (called non-maximum suppression)
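The last step can be sketched as follows (pure Python; assumes R is a 2-D list of response values, ignores a 1-pixel border, and uses a 3×3 neighbourhood):

```python
def non_max_suppression(R, thresh):
    """Keep a pixel only if its response exceeds thresh and is a strict
    local maximum in its 3x3 neighbourhood."""
    h, w = len(R), len(R[0])
    corners = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            r = R[y][x]
            if r <= thresh:
                continue
            neigh = [R[y + dy][x + dx]
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                     if (dy, dx) != (0, 0)]
            if all(r > n for n in neigh):
                corners.append((x, y))
    return corners

# Example: the strong peak at (1, 1) survives; its weaker neighbours do not.
R = [[0, 0, 0, 0],
     [0, 5, 1, 0],
     [0, 1, 2, 0],
     [0, 0, 0, 0]]
corners = non_max_suppression(R, 0.5)   # [(1, 1)]
```

Without this step, every pixel near a strong corner would pass the threshold, producing clumps of detections instead of single points.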
Thresholded, NMS R
Visual representation of a thresholded, NMS R.
Final Results
Visual representation of final results.