Image-based modeling of urban environments is a key component in enabling outdoor, vision-based augmented reality applications. The images used for modeling may come from offline efforts or from online user contributions. Panoramas have been used extensively in mapping cities, and can be captured quickly by an end-user with a mobile phone. In this paper, we describe and evaluate a reconstruction pipeline for upright panoramas taken in an urban environment. First, we describe how panoramas can be aligned to a common vertical orientation using vertical vanishing point detection, which we show to be robust for a range of inputs. The orientation sensors in modern cameras can also be used to correct the vertical orientation. Second, we introduce a pose estimation algorithm that uses knowledge of a common vertical orientation as a simplifying constraint. This procedure is shown to reduce pose estimation error compared with the state of the art. Finally, we evaluate our reconstruction pipeline with several real-world examples.
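
As an illustration of the alignment step, the following is a minimal sketch of how a panorama could be rotated upright once a vertical vanishing direction has been estimated. It is not the detection method of the paper: the vanishing direction is assumed to be given, and the function name `upright_rotation` and the choice of +y as the up axis are assumptions made for this example.

```python
import numpy as np

def upright_rotation(vertical_vp, up=(0.0, 1.0, 0.0)):
    """Return a 3x3 rotation that maps the estimated vertical direction onto `up`.

    `vertical_vp` is the vertical vanishing direction in camera coordinates
    (assumed to be already estimated). The y-up convention is an assumption.
    """
    a = np.asarray(vertical_vp, dtype=float)
    a = a / np.linalg.norm(a)
    b = np.asarray(up, dtype=float)
    b = b / np.linalg.norm(b)

    # A vanishing direction has no inherent sign, so pick the one
    # that points into the same half-space as `up`.
    if np.dot(a, b) < 0.0:
        a = -a

    # Rodrigues-style formula for the smallest rotation taking a to b.
    v = np.cross(a, b)
    c = np.dot(a, b)  # c >= 0 after the sign fix, so 1 + c is never zero
    K = np.array([[0.0, -v[2], v[1]],
                  [v[2], 0.0, -v[0]],
                  [-v[1], v[0], 0.0]])
    return np.eye(3) + K + (K @ K) / (1.0 + c)

# Example: a slightly tilted camera whose vertical direction points mostly down.
R = upright_rotation([0.1, -0.9, 0.2])
```

Applying `R` to the viewing rays of a panorama brings it to an upright orientation. Once two panoramas share the same vertical axis, the rotation that remains between them is a single angle about that axis, which is the kind of simplification the pose estimation step exploits.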

Ventura, J., and T. Höllerer, "Structure from Motion in Urban Environments Using Upright Panoramas", Virtual Reality, vol. 17, issue 2, pp. 147-156, 05/2013.