Abstract
Estimating camera motion from monocular video is a core challenge in computer vision, underpinning applications like SLAM, visual odometry, and structure-from-motion. Numerous methods have been proposed to determine the camera’s heading when its rotation is known either from an IMU or an optimization algorithm. While these approaches perform well in low-noise, low-outlier conditions, they often fail or become computationally expensive as noise and outlier levels increase. To address these limitations, we introduce a novel approach that employs a generalized Hough transform on the unit sphere, $\mathcal{S}^2$, to estimate the camera translation directions. We start by extracting correspondences between two frames and generating a great circle of directions compatible with each pair of correspondences. By discretizing the unit sphere using a Fibonacci Lattice as bin centers, each great circle casts votes for a range of directions, ensuring that features unaffected by noise or dynamic objects vote consistently for the correct motion direction. Experimental results on three datasets demonstrate that our approach exceeds baseline methods in both accuracy and speed.
FLIGHT
Overview

Estimating camera translation direction from monocular video is a core problem in visual odometry, SLAM, and structure-from-motion. When camera rotation is known or independently estimated, translation direction can be recovered from correspondences alone. However, existing methods often suffer from one of two problems:
- They degrade significantly under noise and outliers.
- They become computationally expensive as robustness increases.
FLIGHT addresses both. The key idea is to generalize the Hough transform to the unit sphere. Instead of generating hypotheses from pairs of correspondences, each correspondence defines a great circle of compatible translation directions on $\mathcal{S}^2$. We discretize the sphere using a Fibonacci lattice and let each great circle vote for the bins it crosses, with a weight proportional to the arc length of the circle inside each bin.
Method Summary
FLIGHT consists of four main components:
Great-Circle Representation
Each rotation-compensated correspondence defines a plane through the origin. Its intersection with the unit sphere forms a great circle of possible translation directions.
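As a minimal sketch of this step, assume two unit bearing vectors `f1` and `f2` for the same scene point, with `f1` already rotated into the second camera's frame. The plane they span has unit normal `n`, and every translation direction `t` compatible with the correspondence satisfies `n · t = 0`. The function name `great_circle_normal` and the synthetic scene below are illustrative assumptions, not taken from the FLIGHT code.

```python
import math

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    s = math.sqrt(dot(v, v))
    return (v[0] / s, v[1] / s, v[2] / s)

def great_circle_normal(f1, f2):
    """Unit normal of the plane spanned by two rotation-compensated bearings.

    Hypothetical helper. It is undefined when f1 and f2 are parallel
    (no parallax), because the cross product is then the zero vector.
    """
    return normalize(cross(f1, f2))

# Synthetic check: camera 1 at the origin, camera 2 displaced by t_true,
# with no relative rotation (i.e. rotation already compensated).
t_true = normalize((1.0, 0.2, -0.3))
point = (0.5, -1.0, 4.0)
f1 = normalize(point)
f2 = normalize((point[0] - t_true[0],
                point[1] - t_true[1],
                point[2] - t_true[2]))

n = great_circle_normal(f1, f2)
residual = dot(n, t_true)  # zero up to floating-point rounding
```

The great circle is the set of unit vectors orthogonal to `n`, so it contains both `t_true` and `-t_true`.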
Fibonacci Discretization
We sample $\mathcal{S}^2$ using a Fibonacci lattice to achieve near-uniform bin spacing. Each bin represents a candidate motion direction.
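One common way to construct such a lattice is the golden-angle spiral, sketched below. The function name `fibonacci_lattice` and the bin count of 1000 are assumptions for illustration; the source does not state which variant of the lattice or how many bins FLIGHT uses.

```python
import math

def fibonacci_lattice(m):
    """Return m near-uniformly spaced unit vectors on the sphere.

    Heights z are equally spaced (which gives equal-area bands), and the
    azimuth advances by the golden angle from one point to the next.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(m):
        z = 1.0 - (2.0 * i + 1.0) / m          # strictly inside (-1, 1)
        r = math.sqrt(max(0.0, 1.0 - z * z))   # radius of the latitude ring
        phi = i * golden_angle
        points.append((r * math.cos(phi), r * math.sin(phi), z))
    return points

bins = fibonacci_lattice(1000)  # bin count chosen arbitrarily for the example
```

Each returned point serves as a bin center; a direction on the sphere belongs to the bin whose center is closest to it.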
Arc-Length Voting
Each great circle votes for every bin it intersects. Rather than casting a binary vote, it weights each vote by the arc length of the circle that falls inside the bin.
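The sketch below approximates arc-length voting by sampling each great circle at equal arc-length steps and adding the step length to the nearest bin. This sampling scheme, the brute-force nearest-bin search, and all function names are assumptions made for clarity; the source does not describe how FLIGHT computes the intersections, and an efficient implementation would avoid the exhaustive search.

```python
import math

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    s = math.sqrt(dot(v, v))
    return (v[0] / s, v[1] / s, v[2] / s)

def fibonacci_lattice(m):
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    pts = []
    for i in range(m):
        z = 1.0 - (2.0 * i + 1.0) / m
        r = math.sqrt(max(0.0, 1.0 - z * z))
        pts.append((r * math.cos(i * golden_angle),
                    r * math.sin(i * golden_angle), z))
    return pts

def circle_basis(n):
    """Two orthonormal vectors spanning the plane orthogonal to unit vector n."""
    helper = (1.0, 0.0, 0.0) if abs(n[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = normalize(cross(n, helper))
    return u, cross(n, u)

def accumulate_votes(normals, bins, samples=120):
    """Approximate arc-length voting: each sample adds its step length
    to the bin whose center is closest (largest dot product)."""
    votes = [0.0] * len(bins)
    step = 2.0 * math.pi / samples
    for n in normals:
        u, v = circle_basis(n)
        for k in range(samples):
            c, s = math.cos(k * step), math.sin(k * step)
            p = (c * u[0] + s * v[0], c * u[1] + s * v[1], c * u[2] + s * v[2])
            j = max(range(len(bins)), key=lambda b: dot(p, bins[b]))
            votes[j] += step
    return votes

# Synthetic, outlier-free example: 20 great circles that all pass through t_true.
t_true = normalize((0.3, -0.5, 0.8))
u_t, v_t = circle_basis(t_true)
normals = []
for i in range(20):
    a = math.pi * i / 20
    normals.append((math.cos(a) * u_t[0] + math.sin(a) * v_t[0],
                    math.cos(a) * u_t[1] + math.sin(a) * v_t[1],
                    math.cos(a) * u_t[2] + math.sin(a) * v_t[2]))

bins = fibonacci_lattice(300)
votes = accumulate_votes(normals, bins)
best = bins[max(range(len(bins)), key=votes.__getitem__)]
```

Because every great circle contains both a direction and its antipode, the accumulator in this sketch peaks near `t_true` and near `-t_true` alike; resolving the sign is outside its scope.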
Hierarchical Refinement
A sparse, coarse localization pass is followed by dense local refinement, non-linear eigen refinement, and early stopping.
Final complexity: $\mathcal{O}(nm)$, where $n$ is the number of correspondences and $m$ is the number of bins.
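The source names an eigen refinement step without spelling it out. One plausible reading, shown here purely as an assumption, is a least-squares polish: given the great-circle normals $n_i$ judged consistent with the coarse peak, the direction minimizing $\sum_i (n_i \cdot t)^2$ is the eigenvector of $M = \sum_i n_i n_i^\top$ with the smallest eigenvalue. The sketch finds it by power iteration on $\operatorname{tr}(M) I - M$, starting from a coarse guess `t0`. It is a linear variant and not necessarily the authors' non-linear procedure; all names are hypothetical.

```python
import math

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    s = math.sqrt(dot(v, v))
    return (v[0] / s, v[1] / s, v[2] / s)

def refine_direction(normals, t0, iters=100):
    """Smallest-eigenvalue eigenvector of M = sum(n n^T), via power iteration
    on (trace(M) * I - M), whose dominant eigenvector is the one we want."""
    m = [[sum(n[r] * n[c] for n in normals) for c in range(3)] for r in range(3)]
    trace = m[0][0] + m[1][1] + m[2][2]
    t = normalize(t0)
    for _ in range(iters):
        w = tuple(trace * t[r] - sum(m[r][c] * t[c] for c in range(3))
                  for r in range(3))
        t = normalize(w)
    return t

# Noise-free synthetic example: normals exactly orthogonal to t_true.
t_true = normalize((0.3, -0.5, 0.8))
u = normalize(cross(t_true, (1.0, 0.0, 0.0)))
v = cross(t_true, u)
normals = []
for i in range(20):
    a = math.pi * i / 20
    normals.append((math.cos(a) * u[0] + math.sin(a) * v[0],
                    math.cos(a) * u[1] + math.sin(a) * v[1],
                    math.cos(a) * u[2] + math.sin(a) * v[2]))

t0 = (0.4, -0.4, 0.7)  # rough guess, standing in for a coarse bin center
t_refined = refine_direction(normals, t0)
```

Starting from the voting peak keeps the result in the same hemisphere as the coarse estimate, since the iterated matrix is positive semidefinite.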
Quantitative Results
| Method | mAA 2°↑ | mAA 5°↑ | Time (ms)↓ |
|---|---|---|---|
| FLIGHT | 0.6193 | 0.8223 | 0.9472 |
| FOE | 0.6065 | 0.8169 | 3.8278 |
| BNB | 0.6013 | 0.81 | 35.7238 |
| 2-Points | 0.6141 | 0.8177 | 12.7976 |
| MAGSAC++ | 0.6105 | 0.818 | 2.8462 |
FLIGHT improves accuracy over the classical baselines while running about 66 percent faster than the fastest of them (MAGSAC++).
| Method | mAA 5°↑ | mAA 10°↑ | Time (ms)↓ |
|---|---|---|---|
| FLIGHT | 0.2731 | 0.4781 | 0.9149 |
| FOE | 0.2715 | 0.4795 | 43.4998 |
| BNB | 0.1578 | 0.2988 | 29.4898 |
| 2-Points | 0.2794 | 0.4819 | 11.4159 |
| MAGSAC++ | 0.2778 | 0.4827 | 3.9321 |
FLIGHT achieves accuracy comparable to the classical baselines while running about 76 percent faster than the fastest of them (MAGSAC++).
| Method | mAA 5°↑ | mAA 10°↑ | Time (ms)↓ |
|---|---|---|---|
| FLIGHT | 0.3591 | 0.4541 | 1.4428 |
| FOE | 0.3482 | 0.4429 | 132.2228 |
| BNB | 0.1728 | 0.2834 | 233.4138 |
| 2-Points | 0.291 | 0.385 | 115.2381 |
| MAGSAC++ | 0.2815 | 0.3717 | 2.9324 |
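FLIGHT remains stable under moving objects and high outlier rates, achieving the best accuracy in the table while running about 50 percent faster than the fastest baseline (MAGSAC++).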
Key Advantages
Deterministic
Voting-based approach eliminates randomness inherent in RANSAC-style sampling.
Real-Time
Inference takes about 0.9 to 1.4 ms in the benchmarks above, enabling deployment in latency-sensitive pipelines.
Outlier Robust
Stable runtime and accuracy even under 80% outlier ratios in dynamic scenes.
Drop-In
Simple geometric formulation enables easy integration into existing SLAM systems.
Citation
@misc{dirnfeld2026flight,
title={FLIGHT: Fibonacci Lattice-based Inference for Geometric Heading in real-Time},
author={David Dirnfeld and Fabien Delattre and Pedro Miraldo and Erik Learned-Miller},
year={2026},
eprint={2602.23115},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2602.23115},
}