Boundary-induced and scene-aggregated network for monocular depth prediction

Monocular depth prediction is an important task in scene understanding. It aims to predict a dense depth map from a single RGB image. With the development of deep learning, performance on this task has improved greatly. However, two issues remain unresolved: (1) deep features mislocate the farthest region in a scene, which distorts the 3D structure of the predicted depth; (2) low-level features are insufficiently utilized, which makes it even harder to estimate depth near edges where depth changes abruptly. To tackle these two issues, we propose the Boundary-induced and Scene-aggregated network (BS-Net). In this network, the Depth Correlation Encoder (DCE) is first designed to obtain the contextual correlations between the regions in an image and to perceive the farthest region by considering these correlations. Meanwhile, the Bottom-Up Boundary Fusion (BUBF) module is designed to extract accurate boundaries that indicate depth change. Finally, the Stripe Refinement module (SRM) is designed to refine the dense depth induced by the boundary cue, which improves the boundary accuracy of the predicted depth. Experimental results on the NYUD v2 dataset and the iBims-1 dataset demonstrate the state-of-the-art performance of the proposed approach. In addition, the SUN-RGBD dataset is used to evaluate the generalization of our method. Code is available at https://github.com/XuefengBUPT/BS-Net.
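
To make the notion of a boundary that indicates depth change concrete, the following minimal sketch marks pixels of a depth map where depth jumps sharply between neighbors. It is not the BUBF module and is not taken from the BS-Net code; the function name `depth_boundary_mask` and the `threshold` value are hypothetical, chosen only for illustration.

```python
import numpy as np


def depth_boundary_mask(depth, threshold=0.5):
    """Return a boolean mask of pixels with an abrupt depth change.

    Illustrative only: `threshold` is an assumed value in the same
    units as `depth`, not a parameter from the paper.
    """
    depth = np.asarray(depth, dtype=np.float64)
    # np.gradient returns the derivative along rows first, then columns.
    dy, dx = np.gradient(depth)
    # Gradient magnitude is large where depth is discontinuous.
    magnitude = np.hypot(dx, dy)
    return magnitude > threshold


# Toy scene: a near surface (1 m) on the left, a far one (3 m) on the right.
depth = np.ones((4, 6))
depth[:, 3:] = 3.0
mask = depth_boundary_mask(depth)
```

In this toy scene only the two columns on either side of the depth step are flagged, which is the kind of region where the abstract says depth is hardest to estimate.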
