Depth estimation is a core task in 3D computer vision. Recent methods investigate monocular depth estimation trained with various depth sensor modalities. Every sensor has advantages and drawbacks that stem from the nature of its estimates. In the literature, mostly the mean average error of the depth is investigated, and sensor capabilities are typically not discussed. Indoor environments in particular, however, pose challenges for some devices: textureless regions are difficult for structure from motion, reflective materials are problematic for active sensing, and distances to translucent materials are intricate to measure with existing sensors. This paper proposes HAMMER, a dataset comprising depth estimates from multiple commonly used sensors for indoor depth estimation, namely ToF, stereo, and structured light, together with monocular RGB+P data. We construct highly reliable ground truth depth maps with the help of 3D scanners and aligned renderings. A popular depth estimator is trained on this data and typical depth sensors. The estimates are extensively analyzed on different scene structures. We notice generalization issues arising from various sensor technologies in household environments with challenging but everyday scene content. HAMMER, which we make publicly available, provides a reliable base to pave the way to targeted depth improvements and sensor fusion approaches.