Global Structure-from-Motion by Similarity Averaging



Global structure-from-motion (SfM) methods solve all cameras simultaneously from all available relative motions. It has better potential in both reconstruction accuracy and computation efficiency than incremental methods. However, global SfM is challenging, mainly because of two reasons. Firstly, translation averaging is difficult, since an essential matrix only tells the direction of relative translation. Secondly, it is also hard to filter out bad essential matrices due to feature matching failures. We propose to compute a sparse depth image at each camera to solve both problems. Depth images help to upgrade an essential matrix to a similarity transformation, which can determine the scale of relative translation. Depth images also make the filtering of essential matrices simple and effective. In this way, translation averaging can be solved robustly in two convex L1 optimization problems, which reach the global optimum rapidly. We demonstrate this method in various examples including sequential data, Internet data, and ambiguous data with repetitive scene structures.

Online Video



"Global Structure-from-Motion by Similarity Averaging", Zhaopeng Cui and Ping Tan. IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, Dec. 2015.