Unsupervised learning of depth from indoor monocular videos is challenging because man-made environments contain many textureless regions. Fortunately, indoor scenes are full of regular structures, such as planes and lines, which should help guide unsupervised depth learning. This paper proposes PLNet, which leverages plane and line priors to improve depth estimation. We first represent the scene geometry using local planar coefficients and impose a smoothness constraint on this representation. Moreover, we enforce planar and linear consistency by randomly selecting sets of points that are probably coplanar or collinear and using them to construct simple and effective consistency losses. To verify the effectiveness of the proposed method, we further propose evaluating the flatness and straightness of the predicted point cloud on reliable planar and linear regions. The regularity of these regions indicates high-quality indoor reconstruction. Experiments on NYU Depth V2 and ScanNet show that PLNet outperforms existing methods. The code is available at \url{https://github.com/HalleyJiang/PLNet}.
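To make the idea of planar and linear consistency concrete, the following is a minimal NumPy sketch of how coplanarity and collinearity residuals could be computed from predicted depths. It is an illustration under stated assumptions, not the PLNet implementation: the function names (`backproject`, `coplanarity_residual`, `collinearity_residual`), the pinhole intrinsics `K`, and the choice of a scalar triple product and a cross-product norm as residuals are all hypothetical and may differ from the losses used in the paper.

```python
import numpy as np


def backproject(pixels, depths, K):
    """Lift pixels (N, 2) with depths (N,) to 3D camera-frame points (N, 3).

    Assumes a pinhole camera with intrinsic matrix K (3, 3).
    """
    ones = np.ones((pixels.shape[0], 1))
    homogeneous = np.hstack([pixels, ones])      # (N, 3) rows [u, v, 1]
    rays = homogeneous @ np.linalg.inv(K).T      # each row is K^-1 [u, v, 1]^T
    return rays * depths[:, None]                # scale each ray by its depth


def coplanarity_residual(points):
    """Residual for four 3D points (4, 3); zero when they are coplanar.

    Uses the absolute scalar triple product of AB, AC, AD, i.e. the
    volume of the parallelepiped they span (an assumed choice of residual).
    """
    a, b, c, d = points
    return abs(np.dot(np.cross(b - a, c - a), d - a))


def collinearity_residual(points):
    """Residual for three 3D points (3, 3); zero when they are collinear.

    Uses the norm of AB x AC, i.e. twice the triangle area
    (an assumed choice of residual).
    """
    a, b, c = points
    return np.linalg.norm(np.cross(b - a, c - a))


# Hypothetical intrinsics and pixel samples, for illustration only.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])

plane_pixels = np.array([[100.0, 100.0], [200.0, 100.0],
                         [100.0, 200.0], [200.0, 200.0]])
line_pixels = np.array([[0.0, 0.0], [10.0, 10.0], [20.0, 20.0]])

# Constant depth corresponds to a fronto-parallel plane, so both residuals vanish.
flat = coplanarity_residual(backproject(plane_pixels, np.full(4, 2.0), K))
straight = collinearity_residual(backproject(line_pixels, np.full(3, 2.0), K))

# Perturbing one depth breaks coplanarity and yields a positive residual.
bumpy = coplanarity_residual(
    backproject(plane_pixels, np.array([2.0, 2.0, 2.0, 2.5]), K))
```

In a training setting, residuals of this kind would be averaged over many randomly sampled point sets and minimized together with the photometric objective. How the candidate coplanar and collinear sets are selected is left out of this sketch.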