We propose a novel algorithm for monocular depth estimation that decomposes a metric depth map into a normalized depth map and scale features. The proposed network consists of a shared encoder and three decoders, called G-Net, N-Net, and M-Net, which estimate gradient maps, a normalized depth map, and a metric depth map, respectively. M-Net learns to estimate metric depths more accurately by using the relative depth features extracted by G-Net and N-Net. An advantage of the proposed algorithm is that it can use datasets without metric depth labels to improve metric depth estimation. Experimental results on various datasets demonstrate that the proposed algorithm not only performs competitively with state-of-the-art algorithms but also yields acceptable results even when only a small amount of metric depth data is available for training.
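To make the decomposition idea concrete, the following is a minimal sketch of splitting a metric depth map into a normalized map plus scale information and then recombining them. It rests on assumptions that the abstract does not confirm: the normalization shown here (subtracting the mean and dividing by the standard deviation) is one common choice and not necessarily the one used by the proposed algorithm, the scale is reduced to two scalars rather than learned scale features, and the function names `decompose` and `recompose` are hypothetical.

```python
from statistics import fmean, pstdev


def decompose(depth):
    """Split a metric depth map (list of rows) into a normalized map and scale statistics.

    Assumption: mean/std normalization stands in for the normalization used
    in the paper, which the abstract does not specify.
    """
    flat = [d for row in depth for d in row]
    mu = fmean(flat)
    sigma = pstdev(flat, mu)
    if sigma == 0:
        raise ValueError("a constant depth map has no relative structure to normalize")
    normalized = [[(d - mu) / sigma for d in row] for row in depth]
    return normalized, (mu, sigma)


def recompose(normalized, scale):
    """Rebuild metric depths from a normalized map and its (mean, std) scale."""
    mu, sigma = scale
    return [[n * sigma + mu for n in row] for row in normalized]


if __name__ == "__main__":
    metric = [[1.0, 2.0], [3.0, 4.0]]  # toy depth map, values are illustrative
    normalized, scale = decompose(metric)
    print(normalized)
    print(scale)
    print(recompose(normalized, scale))
```

Under this normalization, multiplying every depth by a positive constant leaves the normalized map unchanged, which illustrates why relative depth can in principle be learned from data that lacks metric labels while the scale is handled separately.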