Unsupervised monocular depth and ego-motion estimation has drawn extensive research attention in recent years. Although current methods have achieved high up-to-scale accuracy, they usually fail to learn the true metric scale due to the inherent scale ambiguity of training on monocular sequences. In this work, we tackle this problem and propose DynaDepth, a novel scale-aware framework that integrates information from vision and IMU motion dynamics. Specifically, we first propose an IMU photometric loss and a cross-sensor photometric consistency loss to provide dense supervision and absolute scales. To fully exploit the complementary information from both sensors, we further derive a differentiable camera-centric extended Kalman filter (EKF) that updates the IMU preintegrated motions when visual measurements are observed. In addition, the EKF formulation enables learning an ego-motion uncertainty measure, which is non-trivial for unsupervised methods. By leveraging the IMU during training, DynaDepth not only learns an absolute scale, but also provides better generalization ability and robustness against vision degradation, such as illumination changes and moving objects. We validate the effectiveness of DynaDepth by conducting extensive experiments and simulations on the KITTI and Make3D datasets.
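To make the fusion step concrete, below is a minimal sketch of a differentiable EKF measurement update, the generic building block behind the camera-centric fusion described above. All names (ekf_update, x_pred, etc.) and the toy 6-DoF setup are illustrative assumptions, not the paper's actual state or measurement model; the point is only that a textbook EKF update, written in torch ops, lets gradients flow back to the IMU-side prediction, and that the updated covariance is what makes an ego-motion uncertainty measure available.

```python
# Hypothetical sketch: a generic differentiable EKF measurement update in
# PyTorch. DynaDepth defines its own camera-centric state and measurement
# model; this only illustrates the standard update equations.
import torch

def ekf_update(x_pred, P_pred, z, H, R):
    """One EKF update step, differentiable end to end.

    x_pred: (D,)   predicted state, e.g. IMU-preintegrated ego-motion
    P_pred: (D, D) predicted state covariance
    z:      (M,)   visual measurement, e.g. network-predicted ego-motion
    H:      (M, D) linearized measurement Jacobian
    R:      (M, M) measurement noise covariance
    """
    # Innovation: discrepancy between the visual measurement and the
    # measurement predicted from the IMU-propagated state.
    y = z - H @ x_pred
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = P_pred @ H.T @ torch.linalg.inv(S)   # Kalman gain
    x_new = x_pred + K @ y                   # fused ego-motion estimate
    P_new = (torch.eye(x_pred.shape[0]) - K @ H) @ P_pred
    return x_new, P_new

# Toy usage: fuse a 6-DoF IMU-predicted ego-motion with a visual one.
D = 6
x_pred = torch.randn(D, requires_grad=True)
P_pred = torch.eye(D) * 0.1
z, H, R = torch.randn(D), torch.eye(D), torch.eye(D) * 0.01
x_fused, P_fused = ekf_update(x_pred, P_pred, z, H, R)
x_fused.sum().backward()  # gradients reach the IMU-side prediction
```

The diagonal of the updated covariance (P_fused in this sketch) is the kind of quantity from which a per-estimate ego-motion uncertainty can be read off, which is otherwise hard to obtain in a purely unsupervised pipeline.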