We present a generic framework for scale-aware direct monocular odometry based on depth prediction from a deep neural network. In contrast to previous methods, where depth information is only partially exploited, we formulate a novel depth prediction residual that allows us to incorporate multi-view depth information. In addition, we propose a truncated robust cost function that prevents inconsistent depth estimations from being taken into account. The photometric and depth-prediction measurements are integrated into a tightly coupled optimization, leading to a scale-aware monocular system that does not accumulate scale drift. Our proposal is not tied to any particular neural network and can work with the vast majority of existing depth prediction solutions. We demonstrate its validity and generality by evaluating it on the KITTI odometry dataset with two publicly available neural networks, comparing it against similar approaches and the state of the art in monocular and stereo SLAM. Experiments show that our proposal largely outperforms classic monocular SLAM, being 5 to 9 times more precise, beats similar approaches, and achieves an accuracy close to that of stereo systems.
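For intuition, the following is a minimal sketch of what such a tightly coupled objective could look like: a photometric term plus a depth-prediction residual gated by a truncated robust cost. The notation below ($\tau$, $\lambda$, $d^{\mathrm{net}}_{\mathbf{p}}$, $d_{\mathbf{p}}$, $\pi$) is assumed for illustration and is not the paper's exact formulation:

\[
E(\mathbf{T}) \;=\; \sum_{\mathbf{p}} \big( I_j\big(\pi(\mathbf{T}, \mathbf{p}, d_{\mathbf{p}})\big) - I_i(\mathbf{p}) \big)^2 \;+\; \lambda \sum_{\mathbf{p}} \rho_\tau\big( d^{\mathrm{net}}_{\mathbf{p}} - d_{\mathbf{p}} \big),
\qquad
\rho_\tau(r) = \min\big(r^2, \tau^2\big),
\]

where $I_i$ and $I_j$ are two frames, $\mathbf{T}$ the relative pose, $d_{\mathbf{p}}$ the estimated depth at pixel $\mathbf{p}$, $\pi$ the warping function, and $d^{\mathrm{net}}_{\mathbf{p}}$ the network-predicted depth. Because $\rho_\tau$ saturates at $\tau^2$, an inconsistent depth prediction contributes only a bounded cost and cannot dominate the optimization, while consistent predictions anchor the metric scale of the photometric term.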