Self-Supervised Monocular Depth Estimation of Untextured Indoor Rotated Scenes

Self-supervised deep learning methods have leveraged stereo images for training monocular depth estimation. Although these methods show strong results on outdoor datasets such as KITTI, they do not match the performance of supervised methods on indoor environments with camera rotation. Indoor, rotated scenes are common in less constrained applications and pose problems for two reasons: an abundance of low-texture regions and the increased complexity of depth cues in images under rotation. To extend self-supervised learning to more generalised environments, we propose two additions. First, we propose a novel Filled Disparity Loss term that corrects for the ambiguity of the image reconstruction error loss in textureless regions. Specifically, we interpolate disparity in untextured regions, using the estimated disparity from surrounding textured areas, and use an L1 loss to correct the original estimation. Our experiments show that, compared to Monodepth by Godard et al., depth estimation is substantially improved on low-texture scenes without any loss on textured scenes. Second, we show that training with an application's representative rotations, in both pitch and roll, is sufficient to significantly improve performance over the entire range of expected rotations. We demonstrate that depth estimation generalises successfully, as performance is not lost when evaluated on test sets with no camera rotation. Together, these developments enable broader use of self-supervised learning of monocular depth estimation in complex environments.
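To make the Filled Disparity Loss idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a simple gradient-magnitude threshold to decide which pixels count as textured, and it swaps in row-wise linear interpolation as the fill step; the abstract does not specify either choice. The function names `texture_mask`, `fill_rows`, and `filled_disparity_loss`, and the `threshold` parameter, are hypothetical.

```python
import numpy as np


def texture_mask(gray, threshold=0.05):
    """Assumed texture test: True where the local gradient magnitude is large."""
    gy, gx = np.gradient(gray.astype(np.float64))
    return np.hypot(gx, gy) > threshold


def fill_rows(disparity, textured):
    """Interpolate disparity across untextured pixels, one image row at a time.

    Textured pixels keep their estimated disparity. Untextured pixels receive
    a value linearly interpolated from the nearest textured pixels in the same
    row (an assumed fill strategy). Rows with no textured pixel are unchanged.
    """
    filled = disparity.astype(np.float64).copy()
    cols = np.arange(disparity.shape[1])
    for r in range(disparity.shape[0]):
        known = textured[r]
        if known.any():
            filled[r] = np.interp(cols, cols[known], disparity[r, known])
    return filled


def filled_disparity_loss(disparity, textured):
    """Mean L1 distance between estimated and filled disparity on untextured pixels."""
    target = fill_rows(disparity, textured)
    untextured = ~textured
    if not untextured.any():
        return 0.0
    return float(np.abs(disparity - target)[untextured].mean())


if __name__ == "__main__":
    # One row whose textured end points (disparity 1 and 3) bracket an
    # untextured interior where the network predicted a flat, wrong value.
    disparity = np.array([[1.0, 5.0, 5.0, 5.0, 3.0]])
    textured = np.array([[True, False, False, False, True]])
    print(fill_rows(disparity, textured))
    print(filled_disparity_loss(disparity, textured))
```

In a training loop this term would presumably be added to the image reconstruction loss with some weight, with the filled target treated as a constant so that gradients only adjust the original estimate. Both the weighting and the stop-gradient are assumptions here, not details given in the abstract.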
