Self-supervised monocular depth estimation approaches not only suffer from scale ambiguity but also infer depth maps whose scale is temporally inconsistent. While disambiguating scale during training is not possible without some form of ground-truth supervision, scale-consistent depth predictions would make it possible to compute the scale once during inference, as a post-processing step, and reuse it over time. With this goal, a set of temporal consistency losses that minimize pose inconsistencies over time is introduced. Evaluations show that introducing these constraints not only reduces depth inconsistencies but also improves the baseline performance of depth and ego-motion prediction.
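To make the idea of computing scale once and reusing it over time concrete, the following is a minimal sketch, not the authors' procedure. It assumes the scale is recovered by median scaling against a sparse metric reference (for example LiDAR returns) available for a single frame, and that depth maps are flattened into plain lists. The function names `estimate_scale` and `apply_scale` and the toy values are hypothetical.

```python
from statistics import median


def estimate_scale(pred_depth, ref_depth):
    """Median-scaling factor from one frame (assumed recovery method).

    pred_depth: predicted depths in arbitrary, unknown units.
    ref_depth:  metric reference depths; 0 marks a pixel with no reference.
    """
    # Keep only pixels where both prediction and reference are valid.
    pairs = [(p, r) for p, r in zip(pred_depth, ref_depth) if p > 0 and r > 0]
    if not pairs:
        raise ValueError("no valid pixels to estimate scale from")
    return median(r for _, r in pairs) / median(p for p, _ in pairs)


def apply_scale(pred_depth, scale):
    """Convert a predicted depth map to metric units with a fixed scale."""
    return [scale * p for p in pred_depth]


# Frame 0: the only frame with a (sparse) metric reference.
frame0_pred = [0.5, 1.0, 2.0, 4.0]
frame0_ref = [2.0, 4.0, 0.0, 16.0]  # third pixel has no reference

# Compute the scale once, as a post-processing step.
scale = estimate_scale(frame0_pred, frame0_ref)

# Frame 1: no reference available; reuse the stored scale.
# This is only meaningful if the network's predictions are scale-consistent
# across frames, which is the property the temporal losses aim for.
frame1_pred = [0.25, 0.75, 1.5]
metric_frame1 = apply_scale(frame1_pred, scale)
```

If the predicted scale drifted between frames, the factor estimated on frame 0 would no longer be valid for frame 1 and would have to be re-estimated per frame, which requires a reference for every frame.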