The self-supervised loss formulation for jointly training depth and egomotion neural networks with monocular images is well studied and has demonstrated state-of-the-art accuracy. One of the main limitations of this approach, however, is that the depth and egomotion estimates are only determined up to an unknown scale. In this paper, we present a novel scale recovery loss that enforces consistency between a known camera height and the estimated camera height, generating metric (scaled) depth and egomotion predictions. We show that our proposed method is competitive with other scale recovery techniques that require more information. Further, we demonstrate that our method facilitates network retraining within new environments, whereas other scale-resolving approaches are incapable of doing so. Notably, our egomotion network is able to produce more accurate estimates than a similar method that recovers scale at test time only.
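To make the camera-height consistency idea concrete, the following is a minimal sketch of one plausible way such a loss could be computed: fit a plane to predicted 3D ground points (in the camera frame), take the camera's perpendicular distance to that plane as the estimated camera height, and penalize its deviation from the known height. The function names and the L1 penalty are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def estimate_camera_height(ground_points):
    """Fit a plane to 3D ground points (camera frame) by least squares
    and return the camera's perpendicular distance to that plane.

    ground_points: (N, 3) array of back-projected ground-pixel points,
    e.g. obtained from the predicted (unscaled) depth map.
    """
    centroid = ground_points.mean(axis=0)
    # SVD of the centered points: the last right-singular vector is the
    # direction of least variance, i.e. the plane normal.
    _, _, vt = np.linalg.svd(ground_points - centroid)
    normal = vt[-1]
    # The camera sits at the origin of its own frame, so its height is
    # the absolute distance from the origin to the fitted plane.
    return abs(np.dot(normal, centroid))

def scale_recovery_loss(ground_points, known_height):
    """Illustrative L1 penalty between the known camera height and the
    height implied by the network's depth prediction."""
    return abs(estimate_camera_height(ground_points) - known_height)
```

For example, points sampled from a flat ground plane 1.5 m below the camera yield an estimated height of 1.5 and a zero loss when the known height is also 1.5; a mis-scaled depth prediction shifts the estimated height and produces a nonzero penalty that drives the network toward metric scale.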