Conventional self-supervised monocular depth prediction methods are based on a static-environment assumption, which leads to accuracy degradation in dynamic scenes due to the mismatch and occlusion problems introduced by object motions. Existing dynamic-object-focused methods only partially solve the mismatch problem, and only at the training-loss level. In this paper, we accordingly propose a novel multi-frame monocular depth prediction method that addresses these problems at both the prediction and supervision-loss levels. Our method, called DynamicDepth, is a new framework trained via a self-supervised cycle-consistent learning scheme. A Dynamic Object Motion Disentanglement (DOMD) module is proposed to disentangle object motions and thereby resolve the mismatch problem. Moreover, a novel occlusion-aware Cost Volume and Re-projection Loss are designed to alleviate the occlusion effects of object motions. Extensive analyses and experiments on the Cityscapes and KITTI datasets show that our method significantly outperforms state-of-the-art monocular depth prediction methods, especially in the regions of dynamic objects. Code is available at https://github.com/AutoAILab/DynamicDepth.
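To make the occlusion-aware re-projection idea concrete, below is a minimal PyTorch-style sketch of a photometric re-projection loss that down-weights pixels flagged as occluded. This is not the authors' released implementation: the function names, the SSIM approximation, and the way the occlusion mask is consumed are all assumptions for illustration; how the mask itself is constructed is method-specific.

```python
import torch
import torch.nn.functional as F

def photometric_error(pred, target, alpha=0.85):
    """Per-pixel SSIM + L1 photometric error, as commonly used in
    self-supervised depth training (a sketch, not the paper's exact loss)."""
    l1 = (pred - target).abs().mean(1, keepdim=True)
    # Local-window SSIM approximated with 3x3 average pooling.
    mu_p = F.avg_pool2d(pred, 3, 1, 1)
    mu_t = F.avg_pool2d(target, 3, 1, 1)
    sigma_p = F.avg_pool2d(pred ** 2, 3, 1, 1) - mu_p ** 2
    sigma_t = F.avg_pool2d(target ** 2, 3, 1, 1) - mu_t ** 2
    sigma_pt = F.avg_pool2d(pred * target, 3, 1, 1) - mu_p * mu_t
    ssim = ((2 * mu_p * mu_t + 1e-4) * (2 * sigma_pt + 9e-4)) / (
        (mu_p ** 2 + mu_t ** 2 + 1e-4) * (sigma_p + sigma_t + 9e-4)
    )
    ssim = ((1 - ssim) / 2).clamp(0, 1).mean(1, keepdim=True)
    return alpha * ssim + (1 - alpha) * l1

def occlusion_aware_reprojection_loss(warped, target, occlusion_mask):
    """Masked photometric loss. `occlusion_mask` is 1 where a pixel is
    believed visible in both frames and 0 where object motion occludes it
    (hypothetical mask; its construction is the method-specific part)."""
    error = photometric_error(warped, target)       # (B, 1, H, W)
    valid = occlusion_mask.float()
    # Average the error only over visible pixels, so occluded regions
    # do not produce spurious supervision.
    return (error * valid).sum() / valid.sum().clamp(min=1.0)
```

The key design choice this sketch illustrates is normalizing by the number of visible pixels rather than the full image area, so that heavily occluded frames do not artificially shrink the loss.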