Funded Research Projects

High-speed Vision-based Motion Estimation (NSF IIS-1464420)

The motivation for this work is the basic insight that increasing the speed of motion estimation leads to a "virtuous cycle" of system improvement: by computing the motion estimate more quickly, the camera can be operated at a higher rate; this in turn leads to less motion between successive frames, enabling faster computations. The speedups achieved in the current preliminary work rely on two techniques: approximation of the motion model, and reduction of the fundamental geometric problem to a smaller form that is quicker to solve. By investigating related problems, we aim to determine an entire class of visual motion estimation problems that can be efficiently solved in a similar way.

REU Site: Machine Learning in Natural Language Processing and Computer Vision (NSF IIS-1659788)

This Research Experience for Undergraduates site seeks to increase the number of undergraduates who are American citizens or permanent residents and who are attracted to careers in research and advanced studies in Computer Science. The project has a special focus on training future computer scientists from institutions with limited research opportunities, as well as women and under-represented minorities. Undergraduate participants explore state-of-the-art topics in machine learning applied to computer vision and natural language processing during a 10-week intensive research program each summer at UCCS.

Tracking and Localization

SPLAT: Spherical Localization and Tracking in Large Spaces

We present an alternative SLAM method that combines spherical Structure-from-Motion with a robust 3D tracking method. We compare our method to ORB-SLAM2 in synthetic and real tests and show that it tracks more reliably in large spaces, with simpler calculation due to the spherical motion constraint. We discuss the method in the context of implementing an AR interface for live sport events in stadiums or other open environments, but it can also be applied to handheld AR in many other outdoor environments.

Global Localization from Monocular SLAM on a Mobile Phone

We propose the combination of a keyframe-based monocular SLAM system and a global localization method. The SLAM system runs locally on a camera-equipped mobile client and provides continuous, relative 6DoF pose estimation as well as keyframe images with computed camera locations. As the local map expands, a server process localizes the keyframes with a pre-made, globally-registered map and returns the global registration correction to the mobile client.

A Minimal Solution to the Generalized Pose-and-Scale Problem

We propose a novel solution to the generalized camera pose problem which includes the internal scale of the generalized camera as an unknown parameter. This further generalization of the well-known absolute camera pose problem has applications in multi-frame loop closure.

Live Tracking and Mapping from Both General and Rotation-Only Camera Motion

We present an approach to real-time tracking and mapping that supports any type of camera motion in 3D environments, that is, general (parallax-inducing) as well as rotation-only (degenerate) motions. Our approach effectively generalizes both a panorama mapping and tracking system and a keyframe-based Simultaneous Localization and Mapping (SLAM) system, behaving like one or the other depending on the camera movement.

Wide-Area Scene Mapping for Mobile Visual Tracking

We propose a system for easily preparing arbitrary wide-area environments for subsequent real-time tracking with a handheld device. Our system evaluation shows that minimal user effort is required to initialize a camera tracking session in an unprepared environment. In contrast to camera-based simultaneous localization and mapping (SLAM) systems, our methods are suitable for handheld use in large outdoor spaces.

Fast and Scalable Keypoint Recognition and Image Retrieval Using Binary Codes

We evaluated keypoint descriptor compression using as little as 16 bits to describe a single keypoint. By indexing the keypoints in a binary tree, we can quickly recognize keypoints with a very compact database, and efficiently insert new keypoints.
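
As a rough illustration of indexing short binary codes in a tree, the sketch below stores 16-bit codes in a bitwise trie with exact-match lookup and incremental insertion. The class name `BinaryCodeTrie`, the bit ordering, and the exact-match lookup are assumptions made for this example; the project's actual descriptor compression and tree layout are not specified here.

```python
class BinaryCodeTrie:
    """Hypothetical bitwise trie over fixed-length binary keypoint codes."""

    def __init__(self, num_bits=16):
        self.num_bits = num_bits
        self.root = {}

    def _bits(self, code):
        # Yield the bits of the code from most significant to least significant.
        for i in reversed(range(self.num_bits)):
            yield (code >> i) & 1

    def insert(self, code, keypoint_id):
        # Walk (and create) one tree level per bit, then store the id at the leaf.
        node = self.root
        for bit in self._bits(code):
            node = node.setdefault(bit, {})
        node.setdefault("ids", []).append(keypoint_id)

    def lookup(self, code):
        # Follow the bit path; a missing branch means the code was never inserted.
        node = self.root
        for bit in self._bits(code):
            node = node.get(bit)
            if node is None:
                return []
        return list(node.get("ids", []))


index = BinaryCodeTrie(num_bits=16)
index.insert(0xBEEF, keypoint_id=1)
index.insert(0xBEEF, keypoint_id=2)
index.insert(0x0001, keypoint_id=3)
matches = index.lookup(0xBEEF)
```

Both insertion and lookup visit one node per bit, so their cost depends on the code length rather than on the number of stored keypoints. This sketch matches codes exactly and does not tolerate bit errors.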

Visual Modeling

CasualStereo: Casual Capture of Stereo Panoramas with Spherical Structure-from-Motion

We evaluate the use of spherical structure-from-motion for reconstructing handheld stereo panorama captures. The spherical motion constraint mitigates the small-baseline problem, making it well-suited to the use case of stereo panorama capture with a handheld camera. We demonstrate the effectiveness of spherical structure-from-motion for casual capture of high-resolution stereo panoramas and validate our results with a user study.

Structure from Motion on a Sphere

We describe a special case of structure from motion where the camera rotates on a sphere. The camera's optical axis lies perpendicular to the sphere's surface. In this case, the camera's pose is minimally represented by three rotation parameters. From analysis of the epipolar geometry we derive a novel and efficient solution for the essential matrix relating two images, requiring only three point correspondences in the minimal case.
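
The following minimal sketch illustrates why three rotation parameters fix the full pose under this constraint. It assumes a unit sphere, an outward-facing camera looking along its +z axis, and the convention x_cam = R X + t with t = (0, 0, -1); these conventions and all function names are assumptions for illustration. It builds the essential matrix between two such cameras and checks the epipolar constraint on a synthetic point. It does not implement the three-point solver itself.

```python
import math

T_OUTWARD = [0.0, 0.0, -1.0]  # assumed fixed translation for an outward-facing camera


def rotation_from_axis_angle(rx, ry, rz):
    """Rodrigues' formula: three parameters (axis times angle) to a 3x3 rotation."""
    theta = math.sqrt(rx * rx + ry * ry + rz * rz)
    if theta < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    x, y, z = rx / theta, ry / theta, rz / theta
    c, s, v = math.cos(theta), math.sin(theta), 1.0 - math.cos(theta)
    return [
        [c + x * x * v, x * y * v - z * s, x * z * v + y * s],
        [y * x * v + z * s, c + y * y * v, y * z * v - x * s],
        [z * x * v - y * s, z * y * v + x * s, c + z * z * v],
    ]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def skew(t):
    """Cross-product matrix [t]_x."""
    return [[0.0, -t[2], t[1]], [t[2], 0.0, -t[0]], [-t[1], t[0], 0.0]]


def camera_center(r):
    """Camera center C = -R^T t = R^T e_z, which lies on the unit sphere."""
    R = rotation_from_axis_angle(*r)
    return [R[2][0], R[2][1], R[2][2]]


def to_camera(r, X):
    """Express world point X in the coordinates of the camera with rotation r."""
    R = rotation_from_axis_angle(*r)
    RX = matvec(R, X)
    return [RX[i] + T_OUTWARD[i] for i in range(3)]


def spherical_essential(r1, r2):
    """Essential matrix E = [t_rel]_x R_rel between two spherical-motion cameras."""
    R1 = rotation_from_axis_angle(*r1)
    R2 = rotation_from_axis_angle(*r2)
    R_rel = matmul(R2, transpose(R1))
    Rt = matvec(R_rel, T_OUTWARD)
    t_rel = [T_OUTWARD[i] - Rt[i] for i in range(3)]  # t_rel = t - R_rel t
    return matmul(skew(t_rel), R_rel)


def epipolar_residual(E, x1, x2):
    Ex1 = matvec(E, x1)
    return sum(x2[i] * Ex1[i] for i in range(3))


r1 = (0.1, -0.2, 0.05)
r2 = (-0.05, 0.15, 0.3)
X = [0.5, -0.3, 4.0]
E = spherical_essential(r1, r2)
residual = epipolar_residual(E, to_camera(r1, X), to_camera(r2, X))
center = camera_center(r1)
```

Because the translation is fixed by the sphere constraint, the relative translation follows from the relative rotation alone, so the essential matrix here has three degrees of freedom.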

Geospatial Management and Utilization of Large-Scale Urban Visual Reconstructions

We describe our approach to efficiently create, handle and organize large-scale Structure-from-Motion reconstructions of urban environments. We store sparse point cloud reconstructions from an omnidirectional camera and differential GPS in a geospatial database and incorporate additional data from multiple crowd-sourced databases, such as maps and images from social media.

Structure from Motion in Urban Environments Using Upright Panoramas

We describe and evaluate a reconstruction pipeline for upright panoramas taken in an urban environment. Panoramas can be aligned to a common vertical orientation using vertical vanishing point detection or orientation sensors. We introduce a pose estimation algorithm which uses knowledge of a common vertical orientation as a simplifying constraint.
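
To illustrate why a shared vertical orientation simplifies pose estimation, the sketch below assumes both panoramas have been aligned so that the vertical is the y axis. The relative rotation then reduces to a single angle about y, and the essential matrix takes a constrained form. The axis choice and the function names are assumptions for this example; the pipeline's actual pose estimation algorithm is not reproduced here.

```python
import math


def yaw_rotation(theta):
    """Rotation by theta about the (assumed) vertical y axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def upright_essential(theta, t):
    """E = [t]_x R_y(theta) for two cameras that share the same vertical axis."""
    tx, ty, tz = t
    t_cross = [[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]]
    R = yaw_rotation(theta)
    return [
        [sum(t_cross[i][k] * R[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


E_upright = upright_essential(0.4, (0.3, -0.1, 0.9))
```

With this parameterization the matrix satisfies E[1][1] = 0, E[0][0] = E[2][2], and E[0][2] = -E[2][0], so fewer unknowns remain than in the general five-degree-of-freedom relative pose problem.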

Online Environment Model Estimation for Augmented Reality

We introduce a system which constructs a textured geometric model of the user’s environment as it is being explored. Image patches in keyframes are assigned to planes in the scene using stereo analysis. This environment model can be rendered into new frames to aid in several common but difficult AR tasks such as accurate real-virtual occlusion and annotation placement.

Fast Annotation and Modeling with a Single-Point Laser Range Finder

We integrate a small, single-point laser range finder into a wearable augmented reality system. First, we present a way of creating object-aligned annotations with very little user effort. Second, we describe techniques to segment and pop up foreground objects. Finally, we introduce a method using the laser range finder to incrementally build 3D panoramas from a fixed observer’s location.

User Interfaces and Graphics

The City of Sights: Designing an Augmented Reality Stage Set

We design and implement a physical and virtual model of an imaginary urban scene — the “City of Sights” — that can serve as a backdrop or “stage” for Augmented Reality (AR) research.

Evaluating the Effects of Tracker Reliability and Field of View on a Target Following Task in Augmented Reality

We examine the effect of varying levels of immersion on the performance of a target following task in augmented reality (AR) X-ray vision. Our study gives insight into the effect of tracking sensor reliability and field of view on user performance.

A Sketch-based Interface for Photo Pop-up

We present sketch-based tools for single-view modeling which allow for quick 3D mark-up of a photograph. Our methods produce good 3D results in a short amount of time and with little user effort, demonstrating the usefulness of an intelligent sketching interface for this application domain.

Depth Compositing for Augmented Reality

We developed a method for automatic depth compositing which uses a stereo camera, without assuming static camera pose or constant illumination. We extend the Layered Graph Cut to general depth compositing by decoupling the color and depth distributions, so that the depth distribution is determined by the disparity map of the virtual scene to be composited in.