Depth estimation, a necessary cue for lifting 2D images into 3D space, has been applied in many areas of machine vision. However, for full 360-degree geometric sensing of the surroundings, traditional stereo matching algorithms for depth estimation are limited by large noise, low accuracy, and strict requirements for multi-camera calibration. In this work, to achieve unified surrounding perception, we use panoramic images to obtain a larger field of view. We extend PADENet, which first appeared in our previous conference work on outdoor scene understanding, to perform panoramic monocular depth estimation with a focus on indoor scenes. At the same time, we improve the training process of the neural network by adapting it to the characteristics of panoramic images. In addition, we fuse a traditional stereo matching algorithm with deep learning methods to further improve the accuracy of depth predictions. Through a comprehensive variety of experiments, this research demonstrates the effectiveness of our schemes for indoor scene perception.