Scale ambiguity is a fundamental problem in monocular visual odometry. Typical solutions include loop closure detection and mining information from the environment. For applications like self-driving cars, loop closure is not always available, hence mining prior knowledge from the environment becomes a more promising approach. In this paper, under the assumption that the camera stays at a constant height above the ground, we develop a light-weight scale recovery framework that leverages an accurate and robust estimation of the ground plane. The framework includes a ground point extraction algorithm for selecting high-quality points on the ground plane, and a ground point aggregation algorithm for joining the extracted ground points in a local sliding window. Based on the aggregated data, the scale is finally recovered by solving a least-squares problem using a RANSAC-based optimizer. Sufficient data and a robust optimizer together enable highly accurate scale recovery. Experiments on the KITTI dataset show that the proposed framework achieves state-of-the-art accuracy in terms of translation error, while maintaining competitive performance on rotation error. Owing to its light-weight design, our framework also runs at a high frequency of 20 Hz on the dataset.
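The final step described above, recovering the scale from aggregated ground points with a RANSAC-based least-squares fit, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `fit_plane_lstsq` and `recover_scale`, the parameters `iters` and `tol`, and the use of a total least-squares (SVD) plane fit are all assumptions made for illustration. The sketch assumes the ground points are already expressed in the camera frame at the arbitrary scale of the monocular odometry, and that the true camera height is known.

```python
import numpy as np


def fit_plane_lstsq(points):
    """Total least-squares plane fit (illustrative helper, not from the paper).

    Returns a unit normal n and a distance h >= 0 such that n . x = h
    for points x on the plane, with the camera at the origin.
    """
    centroid = points.mean(axis=0)
    # The right singular vector with the smallest singular value of the
    # centered points is the direction of least variance: the plane normal.
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    normal = vt[-1]
    height = float(normal @ centroid)
    if height < 0:  # orient the normal so the distance is non-negative
        normal, height = -normal, -height
    return normal, height


def recover_scale(points, true_height, iters=200, tol=0.02, seed=0):
    """RANSAC plane fit followed by a least-squares refit on the inliers.

    points      : (N, 3) ground points in the camera frame, unscaled units
    true_height : known camera height above the ground, in metres
    tol         : inlier distance threshold, in the same unscaled units
    Returns (scale, normal, estimated_height).
    """
    rng = np.random.default_rng(seed)
    best_inliers = None
    for _ in range(iters):
        # Hypothesise a plane from three randomly chosen points.
        sample = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-9:  # degenerate (nearly collinear) sample
            continue
        normal = normal / norm
        # Point-to-plane distances decide which points support the hypothesis.
        distances = np.abs((points - sample[0]) @ normal)
        inliers = distances < tol
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers is None or best_inliers.sum() < 3:
        raise ValueError("RANSAC found no usable ground plane")
    # Refit on all inliers of the best hypothesis.
    normal, height = fit_plane_lstsq(points[best_inliers])
    if height < 1e-9:
        raise ValueError("estimated camera height is too close to zero")
    # The scale maps the unscaled camera height to the known metric height.
    return true_height / height, normal, height
```

Calling `recover_scale(points, true_height=1.65)` would return the factor by which the estimated translations are multiplied to obtain metric units. The value 1.65 m is only an example camera height, and `tol` has to be chosen relative to the unscaled odometry units rather than in metres.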