Structure from Motion

This project aims to estimate structure (the 3D positions of sparse scene points) and motion (the camera parameters of each image), given a sequence of images.

SfM

Getting Started

Camera Model

ACTS: Automatic Camera Tracking System

Bundler: Structure from Motion (SfM) for Unordered Image Collections

Assignments:

Run ACTS on the circle sequence.
Take the estimated focal length of ACTS as input to run Bundler on the circle sequence.
Compare the two results with respect to running time, number of tracked points, reprojection error (the distance between the reprojection of an estimated 3D point into the image and the corresponding 2D feature detected in that image), etc.
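The reprojection error used in the comparison can be computed along the following lines. This is a minimal sketch, assuming a pinhole camera with intrinsics K and a world-to-camera pose (R, t); the function name and the toy data are hypothetical and do not come from ACTS or Bundler.

```python
import numpy as np

def reprojection_errors(K, R, t, X, x):
    """Pixel distance between projected 3D points and detected 2D features.

    K: 3x3 intrinsics, R: 3x3 rotation, t: 3-vector (world -> camera),
    X: Nx3 world points, x: Nx2 detected features.
    """
    Xc = X @ R.T + t                  # points in the camera frame
    uvw = Xc @ K.T                    # homogeneous pixel coordinates
    proj = uvw[:, :2] / uvw[:, 2:3]   # perspective division
    return np.linalg.norm(proj - x, axis=1)

# Hypothetical toy data: one camera at the origin, focal length 500 px.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.zeros(3)
X = np.array([[0.0, 0.0, 5.0], [1.0, -1.0, 10.0]])
x = np.array([[320.0, 240.0], [370.0, 193.0]])   # second feature is 3 px off
errors = reprojection_errors(K, R, t, X, x)
rmse = np.sqrt(np.mean(errors ** 2))
```

Reporting the root mean square over all observations, as in the last line, gives a single number per system that is easy to compare.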

Preliminaries

RANSAC

DLT: Direct Linear Transformation

LM: Levenberg-Marquardt Nonlinear Least Squares Algorithm

Planar Transformation

Homography

Assignments:

Estimate the homography between two planes from given matched points: img1.jpg, pts1.txt, img2.jpg, pts2.txt, match.jpg
1. Use DLT to estimate homography.
2. Use LM to refine the estimate:
Warp one plane to the other through the estimated homography matrix (example). Assume that the estimated homography matrix H maps x₁ (a point in img1) to x₂ (a point in img2), i.e., x₂ = Hx₁. Then each pixel x in the warped image (img1 warped into the frame of img2) takes its value from position H^-1x in img1.
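The DLT step of this assignment can be sketched as follows. The code assumes the matches are given as two Nx2 arrays and adds Hartley normalization, which the assignment does not require but which usually improves conditioning. The function names and the toy data are hypothetical. The `transfer_errors` helper is one possible residual for the LM refinement, not necessarily the cost intended by the assignment.

```python
import numpy as np

def normalize_points(pts):
    """Hartley normalization: zero mean, average distance sqrt(2)."""
    mean = pts.mean(axis=0)
    scale = np.sqrt(2.0) / np.mean(np.linalg.norm(pts - mean, axis=1))
    T = np.array([[scale, 0.0, -scale * mean[0]],
                  [0.0, scale, -scale * mean[1]],
                  [0.0, 0.0, 1.0]])
    pts_h = np.column_stack([pts, np.ones(len(pts))])
    return (pts_h @ T.T)[:, :2], T

def homography_dlt(pts1, pts2):
    """Estimate H with x2 ~ H x1 from N >= 4 matches (Nx2 arrays)."""
    p1, T1 = normalize_points(pts1)
    p2, T2 = normalize_points(pts2)
    rows = []
    for (x, y), (u, v) in zip(p1, p2):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(rows, dtype=float))
    Hn = Vt[-1].reshape(3, 3)
    H = np.linalg.inv(T2) @ Hn @ T1   # undo the normalization
    return H / H[2, 2]                # assumes H[2, 2] is not close to zero

def transfer_errors(H, pts1, pts2):
    """Distance between H x1 and x2 for each match."""
    p = np.column_stack([pts1, np.ones(len(pts1))]) @ H.T
    return np.linalg.norm(p[:, :2] / p[:, 2:3] - pts2, axis=1)

# Hypothetical noise-free matches generated from a known homography.
H_true = np.array([[1.2, 0.1, 30.0],
                   [-0.05, 0.9, -12.0],
                   [1e-4, 2e-4, 1.0]])
pts1 = np.array([[10, 20], [600, 35], [580, 440],
                 [25, 460], [300, 240], [150, 90]], dtype=float)
ph = np.column_stack([pts1, np.ones(len(pts1))]) @ H_true.T
pts2 = ph[:, :2] / ph[:, 2:3]
H = homography_dlt(pts1, pts2)
```

With real matches the DLT result is only an algebraic estimate, which is why the assignment follows it with a nonlinear refinement.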

Tip: for linear algebra operations (e.g., solving the linear system in DLT or inverting the homography matrix), linear algebra packages such as CLAPACK are available.
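The warping step of the homography assignment relies on the inverse of H. Below is a minimal sketch of inverse warping with nearest-neighbour sampling, using NumPy in place of CLAPACK. The function name and the toy image are hypothetical, and the code assumes that no output pixel maps to a point at infinity.

```python
import numpy as np

def warp_image(img1, H, out_shape):
    """Warp img1 into the frame of img2, where x2 ~ H x1.

    Each output pixel x is filled from H^-1 x in img1 (nearest neighbour);
    pixels that map outside img1 stay zero.
    """
    h_out, w_out = out_shape
    H_inv = np.linalg.inv(H)
    xs, ys = np.meshgrid(np.arange(w_out), np.arange(h_out))
    dst = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])   # 3 x N
    src = H_inv @ dst
    sx = np.rint(src[0] / src[2]).astype(int)
    sy = np.rint(src[1] / src[2]).astype(int)
    valid = (sx >= 0) & (sx < img1.shape[1]) & (sy >= 0) & (sy < img1.shape[0])
    out = np.zeros((h_out * w_out,) + img1.shape[2:], dtype=img1.dtype)
    out[valid] = img1[sy[valid], sx[valid]]
    return out.reshape((h_out, w_out) + img1.shape[2:])

# Hypothetical toy case: a pure translation by (2, 1) pixels.
img1 = np.arange(20).reshape(4, 5)
H_shift = np.array([[1.0, 0.0, 2.0],
                    [0.0, 1.0, 1.0],
                    [0.0, 0.0, 1.0]])
warped = warp_image(img1, H_shift, (4, 5))
```

Looping over destination pixels, rather than pushing source pixels forward through H, avoids holes in the warped image. Bilinear interpolation would give smoother results than the rounding used here.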

Camera Calibration

A Flexible New Technique for Camera Calibration

Camera Calibration Toolbox for Matlab

Assignments:

Run Camera Calibration Toolbox for Matlab on calib_img.rar.
Run 'Undistort image' to verify the result. Observe whether the distorted lines become straight after undistortion.
Change 'est_dist' to exclude some distortion parameters from the estimation and observe the undistortion result again. Sometimes using fewer than all 5 distortion parameters gives a better result, even if the reprojection error gets larger.
(*) Implement Zhang's method. Use the corners extracted by Camera Calibration Toolbox for Matlab as input, and compare the results with the Matlab implementation.
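To make the role of the 5 distortion parameters concrete, here is a minimal sketch of a radial plus tangential distortion model on normalized image coordinates. The ordering kc = [k1, k2, p1, p2, k3] is assumed to follow the toolbox convention and should be checked against its documentation. A parameter excluded through 'est_dist' is assumed here to stay at zero. The function names are hypothetical.

```python
import numpy as np

def _distortion_terms(xn, kc):
    """Radial factor and tangential offset for normalized points (Nx2)."""
    k1, k2, p1, p2, k3 = kc
    x, y = xn[:, 0], xn[:, 1]
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    tangential = np.column_stack([
        2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x),
        p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y,
    ])
    return radial[:, None], tangential

def distort(xn, kc):
    """Map ideal normalized points to distorted normalized points."""
    radial, tangential = _distortion_terms(xn, kc)
    return radial * xn + tangential

def undistort(xd, kc, iterations=20):
    """Invert the model by fixed-point iteration (assumes mild distortion)."""
    xn = xd.copy()
    for _ in range(iterations):
        radial, tangential = _distortion_terms(xn, kc)
        xn = (xd - tangential) / radial
    return xn

# Hypothetical parameters: the 6th-order radial term k3 is left at zero.
kc = np.array([-0.2, 0.05, 0.001, -0.002, 0.0])
xn = np.array([[0.3, -0.2], [0.0, 0.0], [-0.4, 0.1]])
xd = distort(xn, kc)
xn_recovered = undistort(xd, kc)
```

Because high-order terms mostly affect points far from the image center, they are poorly constrained when few corners lie near the borders, which is one reason why dropping parameters can help.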

Two-view Reconstruction

Epipolar Geometry

Eight-Point Algorithm

Five-Point Algorithm

Triangulation

Assignments:

Select two images from the circle sequence.
Extract the feature matches between them from the output file (*.act) of ACTS generated at the beginning of this training project. The file format is described in ACTS文件格式.docx (ACTS file format).
Estimate the fundamental matrix:
1. Use the Eight-Point Algorithm within a RANSAC framework to estimate the fundamental matrix F and detect outlier matches simultaneously.
2. Run LM on inliers to refine F:
  F* = argmin_F Σ [ d(x₂, F x₁)² + d(x₁, F^T x₂)² ], summed over the inlier matches (x₁, x₂),
  where d(x,l) is the distance from 2D image point x to 2D image line l.
Perform the metric two-view reconstruction:
1. Use the focal length estimated by ACTS to calibrate the 2D features.
2. Use the Five-Point Algorithm within a RANSAC framework on the matches of calibrated feature points to estimate the essential matrix and detect outlier matches simultaneously.
3. Recover the relative pose (R, t) from the estimated essential matrix.
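4. Run LM on inliers to refine (R, t):
  (R, t)* = argmin_(R,t) Σ [ d(x'₂, E x'₁)² + d(x'₁, E^T x'₂)² ], with E = [t]×R, summed over the inlier matches,
  where x' is the calibrated position of x.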
Triangulate each 3D point from the estimated (R, t) and the measurements x'₁, x'₂:
1. Initialize the 3D position X by DLT.
2. Refine X by LM:
  X* = argmin_X Σ_{i=1,2} ‖π(Cᵢ, X) − x'ᵢ‖²
  where C₁=(I,0), C₂=(R,t), and π(C,X) projects a 3D point X onto the calibrated image plane through camera motion C.
Save the two-view reconstruction result as *.act and use ACTS to observe the result.
Select another two images and repeat steps 2 to 6. Pay attention to the effect of baseline changes on the reconstruction results.
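The fundamental matrix step above can be sketched as follows. The code shows only the normalized eight-point solver and the point-to-line distances d(x, l) that an LM refinement could use as residuals. The RANSAC loop, which would call the solver on random 8-match samples and count matches with small distances as inliers, is omitted. Function names and the synthetic two-view data are hypothetical.

```python
import numpy as np

def _normalize(pts):
    """Return homogeneous normalized points (Nx3) and the 3x3 transform."""
    mean = pts.mean(axis=0)
    s = np.sqrt(2.0) / np.mean(np.linalg.norm(pts - mean, axis=1))
    T = np.array([[s, 0.0, -s * mean[0]],
                  [0.0, s, -s * mean[1]],
                  [0.0, 0.0, 1.0]])
    return np.column_stack([pts, np.ones(len(pts))]) @ T.T, T

def fundamental_eight_point(pts1, pts2):
    """Normalized eight-point estimate of F with x2^T F x1 = 0 (N >= 8)."""
    p1, T1 = _normalize(pts1)
    p2, T2 = _normalize(pts2)
    # Row n holds the products x2_i * x1_j, so row . vec(F) = x2^T F x1.
    A = np.einsum('ni,nj->nij', p2, p1).reshape(-1, 9)
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt   # enforce rank 2
    F = T2.T @ F @ T1                          # undo the normalization
    return F / np.linalg.norm(F)

def epipolar_distances(F, pts1, pts2):
    """Return d(x1, F^T x2) and d(x2, F x1) for each match, in pixels."""
    h1 = np.column_stack([pts1, np.ones(len(pts1))])
    h2 = np.column_stack([pts2, np.ones(len(pts2))])
    l2 = h1 @ F.T          # epipolar lines F x1 in image 2
    l1 = h2 @ F            # epipolar lines F^T x2 in image 1
    num = np.abs(np.sum(h2 * l2, axis=1))
    d1 = num / np.hypot(l1[:, 0], l1[:, 1])
    d2 = num / np.hypot(l2[:, 0], l2[:, 1])
    return d1, d2

# Hypothetical synthetic scene observed by two cameras.
rng = np.random.default_rng(1)
X = rng.uniform([-2.0, -2.0, 4.0], [2.0, 2.0, 8.0], size=(20, 3))
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
c, s = np.cos(0.1), np.sin(0.1)
R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
t = np.array([1.0, 0.0, 0.1])

def project(R, t, X):
    uvw = (X @ R.T + t) @ K.T
    return uvw[:, :2] / uvw[:, 2:3]

pts1 = project(np.eye(3), np.zeros(3), X)
pts2 = project(R, t, X)
F = fundamental_eight_point(pts1, pts2)
d1, d2 = epipolar_distances(F, pts1, pts2)
```

On noise-free data both distance arrays are numerically zero. On real matches they provide the inlier test inside RANSAC as well as the residual vector for LM.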

Tips:

Bundler includes the following algorithms, which can be a useful reference for your own implementation:
1. Fundamental matrix estimation
2. Calibrated 5-point relative pose
3. Triangulation of multiple rays
When refining the relative pose using LM, the rotation matrix R can be factorized as dR*R_init, where R_init is the initial guess, which is held fixed during the iterations, and dR is parameterized by a 3D vector ω using Rodrigues' rotation formula. This parameterization simplifies the computation of the Jacobian matrix through the first-order approximation:

  dR·v ≈ v + ω × v = v + [0, v_z, −v_y; −v_z, 0, v_x; v_y, −v_x, 0]·ω

where v=(v_x, v_y, v_z)^T is an arbitrary 3D vector and the rows of the 3x3 matrix are separated by semicolons.
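The quality of this first-order approximation is easy to check numerically. The sketch below implements Rodrigues' formula and compares dR·v with v + ω × v for a small ω. The helper names and the sample vectors are hypothetical.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]x, so that skew(v) @ w equals v x w."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def rodrigues(w):
    """Rotation matrix for the axis-angle vector w."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3) + skew(w)    # first-order limit for tiny angles
    Kx = skew(w / theta)
    return np.eye(3) + np.sin(theta) * Kx + (1.0 - np.cos(theta)) * Kx @ Kx

w = np.array([1e-3, -2e-3, 5e-4])     # small update, as inside LM iterations
v = np.array([0.3, -1.2, 2.0])
exact = rodrigues(w) @ v
approx = v - skew(v) @ w              # equals v + w x v
jacobian = -skew(v)                   # derivative of dR v w.r.t. w at w = 0
```

Since ω is reset to zero after each update is folded into R_init, the Jacobian is always evaluated at ω = 0, where the approximation is exact to first order.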

Multi-view Reconstruction

Three Point Perspective Pose Estimation

Incremental Structure from Motion

SBA: Sparse Bundle Adjustment

Assignments:

Extract the feature tracks among the circle sequence from the output file (*.act) of ACTS generated at the beginning of this training project, and use the focal length to calibrate all the 2D features.
For a video sequence, adjacent frames provide redundant information, and narrow-baseline multi-view reconstruction has much more uncertainty in the depth direction, so only keyframes are needed. At this step, keyframes are extracted based on the following two criteria:
1. Between adjacent keyframes, the number of matches satisfying epipolar geometry must be above a threshold.
2. When the first criterion is satisfied, the interval between adjacent keyframes must be as large as possible.
Estimate the camera parameters for each keyframe and the 3D position for each feature track among keyframes, using Incremental Structure from Motion:
1. Select two keyframes as the initial pair, based on the number of matches satisfying epipolar geometry and on the homography ratio, which describes the degeneracy and thus indicates the baseline width.
2. Perform the two-view reconstruction, followed by Bundle Adjustment.
3. Select other keyframes that contain a sufficient number of recovered scene points.
4. Estimate the camera pose C=(R, t) for the newly selected keyframes, using Three Point Perspective Pose Estimation within a RANSAC framework, followed by LM optimization on the inliers to refine C:
  C* = argmin_C Σᵢ ‖π(C, Xᵢ) − x'ᵢ‖², summed over the recovered 3D points Xᵢ observed in the keyframe
5. Triangulate each newly involved 3D point X from the corresponding recovered C=(R, t) and calibrated 2D measurements x', using DLT followed by LM optimization:
  X* = argmin_X Σⱼ ‖π(Cⱼ, X) − x'ⱼ‖², summed over the recovered keyframes Cⱼ that observe X
6. Perform Bundle Adjustment on all the recovered keyframes and 3D points.
7. Repeat steps 3 to 6 until all the keyframes are recovered.
8. Recover the other frames using the method described in step 4, and perform Bundle Adjustment again.
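The DLT initialization used when triangulating a point (step 5) extends directly from two views to any number of views. The following is a minimal sketch, assuming calibrated measurements and world-to-camera poses (R, t). The function names and the three-camera toy setup are hypothetical.

```python
import numpy as np

def project(R, t, X):
    """pi(C, X): project X onto the calibrated image plane of C = (R, t)."""
    Xc = R @ X + t
    return Xc[:2] / Xc[2]

def triangulate_dlt(cameras, observations):
    """Linear triangulation from calibrated observations.

    cameras: list of (R, t); observations: list of calibrated (x, y).
    Each view contributes two linear equations in the homogeneous point.
    """
    rows = []
    for (R, t), (x, y) in zip(cameras, observations):
        P = np.column_stack([R, t])   # 3x4 calibrated projection matrix
        rows.append(x * P[2] - P[0])
        rows.append(y * P[2] - P[1])
    _, _, Vt = np.linalg.svd(np.asarray(rows))
    Xh = Vt[-1]
    return Xh[:3] / Xh[3]             # assumes the point is not at infinity

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

# Hypothetical setup: three keyframes observing one scene point.
cameras = [(np.eye(3), np.zeros(3)),
           (rot_y(0.2), np.array([-1.0, 0.0, 0.1])),
           (rot_y(-0.15), np.array([0.8, 0.1, 0.0]))]
X_true = np.array([0.3, -0.4, 5.0])
observations = [project(R, t, X_true) for R, t in cameras]
X_est = triangulate_dlt(cameras, observations)
```

The linear solution minimizes an algebraic error rather than the reprojection error, so it serves as the starting point for the LM refinement rather than as the final estimate.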

Tips:

Bundler includes all the components of Incremental Structure from Motion, except that Bundler uses DLT followed by RQ decomposition to initialize camera pose.
Bundler uses the package-provided forward-difference method to approximate the Jacobian matrix. In this training project, all the Jacobian matrices involved in the LM and SBA procedures can be calculated analytically via the chain rule to increase accuracy and efficiency.
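As an illustration of the difference between the two approaches, the sketch below computes the Jacobian of the calibrated projection π(C, X) with respect to X, once analytically through the chain rule and once by forward differences. The function names and the sample pose are hypothetical. The same pattern is a useful unit test for every analytic Jacobian written in this project.

```python
import numpy as np

def project(R, t, X):
    """pi(C, X) for C = (R, t): calibrated perspective projection."""
    Xc = R @ X + t
    return Xc[:2] / Xc[2]

def jacobian_analytic(R, t, X):
    """d pi / d X = (d pi / d Xc) @ (d Xc / d X), with d Xc / d X = R."""
    x, y, z = R @ X + t
    dpi_dXc = np.array([[1.0 / z, 0.0, -x / z ** 2],
                        [0.0, 1.0 / z, -y / z ** 2]])
    return dpi_dXc @ R

def jacobian_forward(R, t, X, eps=1e-6):
    """Forward-difference approximation, one extra projection per column."""
    f0 = project(R, t, X)
    J = np.zeros((2, 3))
    for k in range(3):
        dX = np.zeros(3)
        dX[k] = eps
        J[:, k] = (project(R, t, X + dX) - f0) / eps
    return J

# Hypothetical pose: rotation of 0.3 rad about the z axis.
c, s = np.cos(0.3), np.sin(0.3)
R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
t = np.array([0.1, -0.2, 0.5])
X = np.array([0.4, 0.3, 3.0])
J_analytic = jacobian_analytic(R, t, X)
J_numeric = jacobian_forward(R, t, X)
max_difference = np.max(np.abs(J_analytic - J_numeric))
```

The analytic version needs no extra function evaluations and has no step-size error, which is where the gain in efficiency and accuracy comes from.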