A typical monocular depth estimator is trained for a single camera, so its performance drops severely on images taken with different cameras. To address this issue, we propose a versatile depth estimator (VDE), composed of a common relative depth estimator (CRDE) and multiple relative-to-metric converters (R2MCs). The CRDE extracts relative depth information, and each R2MC converts the relative information to predict metric depths for a specific camera. The proposed VDE can cope with diverse scenes, including both indoor and outdoor scenes, with only a 1.12% parameter increase per camera. Experimental results demonstrate that VDE supports multiple cameras effectively and efficiently and also achieves state-of-the-art performance in the conventional single-camera scenario.
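The split between one shared CRDE and one R2MC per camera can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_crde` and `AffineConverter` are hypothetical stand-ins (a min-max normalizer and a scale-and-shift mapping), chosen only to show how a camera identifier routes a shared relative prediction to a camera-specific converter. The helper `total_params` additionally assumes that the reported 1.12% is measured against the base model size and grows linearly with the number of cameras, which the abstract does not state explicitly.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AffineConverter:
    """Hypothetical stand-in for an R2MC: metric = scale * relative + shift."""
    scale: float
    shift: float = 0.0

    def __call__(self, relative: List[float]) -> List[float]:
        return [self.scale * r + self.shift for r in relative]


def toy_crde(image: List[float]) -> List[float]:
    """Placeholder relative estimator: min-max normalizes values to [0, 1]."""
    lo, hi = min(image), max(image)
    span = (hi - lo) or 1.0  # avoid division by zero on constant input
    return [(p - lo) / span for p in image]


@dataclass
class VersatileDepthEstimator:
    """One shared relative estimator plus one lightweight converter per camera."""
    crde: Callable[[List[float]], List[float]]
    r2mcs: Dict[str, AffineConverter] = field(default_factory=dict)

    def add_camera(self, camera_id: str, converter: AffineConverter) -> None:
        # Supporting a new camera adds only a converter; the CRDE is reused.
        self.r2mcs[camera_id] = converter

    def predict(self, image: List[float], camera_id: str) -> List[float]:
        relative = self.crde(image)            # camera-agnostic step
        return self.r2mcs[camera_id](relative)  # camera-specific step


def total_params(base_params: float, num_cameras: int,
                 per_camera_increase: float = 0.0112) -> float:
    """Assumed linear growth: each camera adds a fixed fraction of the base."""
    return base_params * (1 + per_camera_increase * num_cameras)


vde = VersatileDepthEstimator(crde=toy_crde)
vde.add_camera("indoor", AffineConverter(scale=10.0))
vde.add_camera("outdoor", AffineConverter(scale=80.0))
indoor_depth = vde.predict([0.0, 5.0, 10.0], "indoor")
outdoor_depth = vde.predict([0.0, 5.0, 10.0], "outdoor")
```

Under these assumptions, supporting four cameras would cost about 1.045 times the base parameter count, whereas training four separate single-camera estimators would cost four times as much.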