The self-supervised loss formulation for jointly training depth and egomotion neural networks with monocular images is well studied and has demonstrated state-of-the-art accuracy. One of the main limitations of this approach, however, is that the depth and egomotion estimates are only determined up to an unknown scale. In this paper, we present a novel scale recovery loss that enforces consistency between a known camera height and the estimated camera height, generating metric (scaled) depth and egomotion predictions. We show that our proposed method is competitive with other scale recovery techniques that have more information available. Further, we demonstrate how our method facilitates network retraining within new environments, whereas other scale-resolving approaches are incapable of doing so. Notably, our egomotion network is able to produce more accurate estimates than a similar method that only recovers scale at test time.
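As a rough illustration of what a camera-height consistency term could look like, the sketch below back-projects predicted depth to 3D, fits a plane to pixels assumed to lie on the ground, and penalizes the gap between the fitted camera height and the known one. The abstract does not specify the formulation, so the function names, the precomputed `ground_mask`, the least-squares plane fit, and the L1 penalty are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def backproject(depth, K):
    """Lift a depth map (H, W) to camera-frame 3D points (H, W, 3)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
    rays = pix @ np.linalg.inv(K).T          # K^-1 [u, v, 1]^T for every pixel
    return rays * depth[..., None]

def estimate_camera_height(points):
    """Fit a plane to (N, 3) ground points by least squares and return
    the distance from the camera origin to that plane."""
    centroid = points.mean(axis=0)
    # The right singular vector with the smallest singular value is the plane normal.
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    normal = vt[-1]
    return abs(normal @ centroid)

def scale_recovery_loss(depth, K, ground_mask, known_height):
    """Hypothetical loss: absolute difference between estimated and known camera height.
    ground_mask is an assumed boolean (H, W) array marking ground pixels."""
    points = backproject(depth, K)[ground_mask]
    return abs(estimate_camera_height(points) - known_height)
```

Because depth and camera height scale together, a depth map that is off by a global factor yields a camera height that is off by the same factor, which is what lets a term of this kind pin down metric scale. In actual training the same computation would have to be written in a differentiable framework so that gradients reach the depth network.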