Consistent Depth Maps Recovery

Guofeng Zhang¹, Jiaya Jia², Tien-Tsin Wong², Hujun Bao¹
¹State Key Lab of CAD&CG, Zhejiang University; ²The Chinese University of Hong Kong


Abstract: This paper presents a novel method for recovering consistent depth maps from a video sequence. We propose a bundle optimization framework to address the major difficulties in stereo reconstruction, such as image noise, occlusions, and outliers. Unlike typical multi-view stereo methods, our approach not only imposes the photo-consistency constraint but also explicitly associates geometric coherence with multiple frames in a statistical way. It can thus naturally maintain the temporal coherence of the recovered dense depth maps without over-smoothing. To make the inference tractable, we introduce an iterative optimization scheme that first initializes the disparity maps using a segmentation prior and then refines the disparities by means of bundle optimization. Instead of defining visibility parameters, our method implicitly models the reconstruction noise as well as the probabilistic visibility. After bundle optimization, we apply an efficient space-time fusion algorithm to further reduce the reconstruction noise. Our automatic depth recovery is evaluated on a variety of challenging video examples.
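To make the combination of photo-consistency and geometric coherence more concrete, the following is a minimal sketch on a single scanline with rectified, horizontally offset cameras. The function name `bundle_likelihood`, the parameters `sigma_c` and `sigma_d`, and the exact form of the two terms are illustrative assumptions and are not taken from this page; please consult the papers listed below for the actual formulation.

```python
import numpy as np

def bundle_likelihood(ref_row, other_rows, other_disps, offsets, x, d,
                      sigma_c=10.0, sigma_d=2.0):
    """Score candidate disparity d at pixel x of the reference scanline.

    ref_row     : 1-D intensities of the reference frame
    other_rows  : list of 1-D intensities of neighbouring frames
    other_disps : list of 1-D current disparity estimates of those frames
    offsets     : signed camera offsets (in baseline units), one per frame
    All names and formulas here are hypothetical, for illustration only.
    """
    n = len(ref_row)
    score = 0.0
    for row, disp, b in zip(other_rows, other_disps, offsets):
        xp = int(round(x - b * d))      # where x lands in the other frame
        if not 0 <= xp < n:
            continue                    # projects outside the image
        # Photo-consistency: close to 1 when the intensities agree.
        p_c = sigma_c / (sigma_c + abs(float(ref_row[x]) - float(row[xp])))
        # Geometric coherence: map xp back using the other frame's own
        # disparity estimate and measure how far it lands from x.
        x_back = xp + b * disp[xp]
        p_v = np.exp(-(x - x_back) ** 2 / (2.0 * sigma_d ** 2))
        score += p_c * p_v
    return score

# Toy scene with a constant true disparity of 3 (wrap-around shift).
rng = np.random.default_rng(0)
ref = rng.uniform(0, 255, size=32)
true_d = 3
frame_a = np.roll(ref, -true_d)         # camera at offset +1
frame_b = np.roll(ref, true_d)          # camera at offset -1
disps = [np.full(32, float(true_d))] * 2
scores = [bundle_likelihood(ref, [frame_a, frame_b], disps, [1, -1], 10, d)
          for d in range(8)]
best = int(np.argmax(scores))
```

In this toy setup a wrong disparity is penalized twice: the intensities tend to disagree, and the back-projected position drifts away from `x`. Multiplying the two terms is one simple way to let the disparity estimates of neighbouring frames vote on the reference frame.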


Recovering Consistent Video Depth Maps via Bundle Optimization
Guofeng Zhang, Jiaya Jia, Tien-Tsin Wong and Hujun Bao
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2008.

Consistent Depth Maps Recovery from a Video Sequence
Guofeng Zhang, Jiaya Jia, Tien-Tsin Wong and Hujun Bao
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 31(6):974-988, 2009.


@article{zhang2009consistent,
  author  = {Guofeng Zhang and Jiaya Jia and Tien-Tsin Wong and Hujun Bao},
  title   = {Consistent Depth Maps Recovery from a Video Sequence},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume  = {31},
  number  = {6},
  pages   = {974--988},
  year    = {2009}
}


Further Improvement with Depth-Level Expansion:

For traditional belief propagation, the computational complexity is linear in the number of labels. Accurate depth estimation with a large number of depth levels therefore consumes a lot of memory and is very time-consuming. We have proposed a depth-level expansion method that dramatically densifies the depth levels without introducing much computational overhead. Please refer to "Refilming with Depth-Inferred Videos" for more details.
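As a rough illustration of why the cost grows with the number of labels, the sketch below computes one min-sum belief propagation message over L depth labels. This page does not say which BP variant or smoothness cost is used, so the truncated-linear cost (slope `s`, truncation `T`) and the well-known two-pass update are assumptions. This is not the depth-level expansion method itself, which is described in the paper referenced above.

```python
import numpy as np

def message_bruteforce(h, s, T):
    """O(L^2) reference: m[l] = min over k of h[k] + min(s * |l - k|, T)."""
    labels = np.arange(len(h))
    pairwise = np.minimum(s * np.abs(labels[:, None] - labels[None, :]), T)
    return (h[None, :] + pairwise).min(axis=1)

def message_linear(h, s, T):
    """O(L) version: forward sweep, backward sweep, then truncation."""
    m = np.array(h, dtype=float)
    for l in range(1, len(m)):               # propagate costs left to right
        m[l] = min(m[l], m[l - 1] + s)
    for l in range(len(m) - 2, -1, -1):      # propagate costs right to left
        m[l] = min(m[l], m[l + 1] + s)
    return np.minimum(m, float(np.min(h)) + T)   # apply the truncation

# Compare both versions on random per-label costs.
rng = np.random.default_rng(1)
h = rng.uniform(0.0, 10.0, size=64)
m_fast = message_linear(h, s=1.0, T=4.0)
m_ref = message_bruteforce(h, s=1.0, T=4.0)
```

Even with the linear-time update, every pixel sends a message of length L to each neighbour in every iteration, so doubling the number of depth levels roughly doubles both the message storage and the work per iteration.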



Bundle optimization illustration

After bundle optimization, the recovered dense depths are not only temporally consistent but also very accurate, even around depth discontinuities at object boundaries.

Road sequence


The initial result without using segmentation


Segmentation prior


Initialization result using segmentation

Our final result after bundle optimization

Magnified views


More results


Fountain-P11 example


Result Movie

Bundle Optimization Illustration