We address the problem of depth and ego-motion estimation from image sequences. Recent work in this domain trains a deep learning model for both tasks in a self-supervised manner, using image reconstruction as the training signal. We revisit the assumptions and limitations of current approaches and propose two improvements that boost the performance of depth and ego-motion estimation. First, we use Lie group properties to enforce geometric consistency between the images in a sequence and their reconstructions. Second, we propose a mechanism that pays attention to image regions where the image reconstruction is corrupted. We show how to integrate this mechanism into the pipeline in the form of attention gates and how to use the attention coefficients as a mask. We evaluate the new architecture on the KITTI datasets and compare it to previous techniques. Our approach improves on the state-of-the-art results for ego-motion estimation and achieves comparable results for depth estimation.
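As an illustration of the kind of Lie group property that can serve as a consistency constraint, the sketch below checks that a forward pose and a backward pose compose to the identity in SE(3). This is a minimal sketch under stated assumptions, not the formulation used in the paper: the 6-vector pose parameterization, the function names `se3_exp` and `inverse_consistency_loss`, and the Frobenius-norm penalty are all hypothetical choices made for this example.

```python
import numpy as np

def se3_exp(xi):
    """Exponential map from a 6-vector to a 4x4 SE(3) matrix.

    Assumed layout (hypothetical): xi[:3] is the translational part,
    xi[3:] is the axis-angle rotation part.
    """
    v, w = xi[:3], xi[3:]
    theta = np.linalg.norm(w)
    # Skew-symmetric matrix of the rotation vector.
    W = np.array([[0.0, -w[2], w[1]],
                  [w[2], 0.0, -w[0]],
                  [-w[1], w[0], 0.0]])
    if theta < 1e-8:
        # First-order approximation for very small rotations.
        R = np.eye(3) + W
        V = np.eye(3) + 0.5 * W
    else:
        A = np.sin(theta) / theta
        B = (1.0 - np.cos(theta)) / theta**2
        C = (theta - np.sin(theta)) / theta**3
        R = np.eye(3) + A * W + B * (W @ W)   # Rodrigues' formula
        V = np.eye(3) + B * W + C * (W @ W)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = V @ v
    return T

def inverse_consistency_loss(T_fwd, T_bwd):
    """Frobenius distance between T_bwd @ T_fwd and the identity.

    Zero when the backward pose is exactly the inverse of the forward pose.
    """
    return np.linalg.norm(T_bwd @ T_fwd - np.eye(4))

# Hypothetical pose between two frames, and a consistent backward pose.
xi = np.array([0.1, -0.2, 0.3, 0.05, 0.1, -0.15])
T_fwd = se3_exp(xi)
T_bwd = se3_exp(-xi)          # exp(-xi) is the group inverse of exp(xi)
loss_consistent = inverse_consistency_loss(T_fwd, T_bwd)
loss_inconsistent = inverse_consistency_loss(T_fwd, se3_exp(-0.5 * xi))
```

In a training setup, a penalty of this kind would be computed on the poses predicted by the network for a frame pair in both directions and added to the reconstruction loss. A differentiable framework would replace NumPy in that case.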