A Minimal Solution to the Generalized Pose-and-Scale Problem

Jonathan Ventura¹, Clemens Arth¹, Gerhard Reitmayr² and Dieter Schmalstieg¹

¹ Graz University of Technology

² Qualcomm Austria Research Center


We propose a novel solution to the generalized camera pose problem which includes the internal scale of the generalized camera as an unknown parameter. This further generalization of the well-known absolute camera pose problem has applications in multi-frame loop closure. While a well-calibrated camera rig has a fixed and known scale, camera trajectories produced by monocular motion estimation necessarily lack a scale estimate. Thus, when performing loop closure in monocular visual odometry, or registering separate structure-from-motion reconstructions, we must estimate a seven degree-of-freedom similarity transform from corresponding observations.
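
As a sketch of the constraint behind this problem, consider one ray-to-point correspondence. The symbols below are assumptions introduced here for illustration and are not notation taken from this page: the i-th ray of the generalized camera has origin p_i and unit direction d_i, X_i is the matching 3D point, and α_i is the unknown depth along the ray. With rotation R, translation t and scale s, one way to write the correspondence is:

```latex
\[
  s\,\mathbf{p}_i + \alpha_i\,\mathbf{d}_i \;=\; R\,\mathbf{X}_i + \mathbf{t},
  \qquad R \in SO(3),\quad \mathbf{t} \in \mathbb{R}^3,\quad s > 0 .
\]
% Unknowns in the similarity transform: 3 (rotation) + 3 (translation) + 1 (scale) = 7.
% Each correspondence gives 3 equations and adds 1 unknown depth \alpha_i,
% so it contributes 2 net constraints.
```

Under this counting, three correspondences give only six constraints, while four give eight, which is enough to determine the seven unknowns.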

Existing approaches solve this problem, in specialized configurations, by aligning 3D triangulated points or individual camera pose estimates. Our approach handles general configurations of rays and points and directly estimates the full similarity transformation from the 2D-3D correspondences. Four correspondences are needed in the minimal case, which has eight possible solutions. The minimal solver can be used in a hypothesize-and-test architecture for robust transformation estimation. Our solver also produces a least-squares estimate in the overdetermined case.
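
To illustrate the "test" half of a hypothesize-and-test loop, the following Python sketch scores one candidate similarity transform against a set of ray-to-point correspondences. It is not the authors' MATLAB implementation and it does not include the minimal solver itself. The function names `ray_residuals` and `count_inliers`, the point-to-ray distance used as the error measure, and the convention that the scale multiplies the ray origins are all assumptions made for this example.

```python
import numpy as np


def ray_residuals(R, t, s, origins, directions, points):
    """Distance from each transformed point R @ X + t to its ray.

    Assumed convention (hypothetical, for illustration): ray i starts at
    s * origins[i] and runs along directions[i]; points[i] is the 3D point
    observed by that ray. Arrays have shape (N, 3).
    """
    # Normalize directions so the projection below is a plain dot product.
    d = directions / np.linalg.norm(directions, axis=1, keepdims=True)
    # Vector from each scaled ray origin to the transformed point.
    v = points @ R.T + t - s * origins
    # Remove the component along the ray; what remains is the offset from it.
    along = np.sum(v * d, axis=1, keepdims=True)
    return np.linalg.norm(v - along * d, axis=1)


def count_inliers(R, t, s, origins, directions, points, threshold):
    """Number of correspondences whose point-to-ray distance is below threshold."""
    residuals = ray_residuals(R, t, s, origins, directions, points)
    return int(np.sum(residuals < threshold))
```

In a robust estimation loop, each of the candidate solutions returned by a minimal solver for a random sample of four correspondences would be scored this way, and the candidate with the most inliers kept. The choice of threshold depends on the scale of the reconstruction and is left to the caller.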

The approach is evaluated experimentally on synthetic and real datasets, and is shown to produce higher-accuracy solutions to multi-frame loop closure than existing approaches.


Ventura, J., C. Arth, G. Reitmayr, and D. Schmalstieg, A Minimal Solution to the Generalized Pose-and-Scale Problem, Computer Vision and Pattern Recognition (CVPR), 2014.


A MATLAB implementation of the algorithm is available under a BSD license on GitHub.


The structure-from-motion alignment dataset from the paper is available here under a Creative Commons Attribution license.