While self-supervised monocular depth estimation in driving scenarios has achieved performance comparable to supervised approaches, violations of the static world assumption can still lead to erroneous depth predictions for traffic participants, posing a potential safety issue. In this paper, we present R4Dyn, a novel set of techniques for using cost-efficient radar data on top of a self-supervised depth estimation framework. In particular, we show how radar can be used during training as a weak supervision signal, as well as an extra input to enhance estimation robustness at inference time. Since automotive radars are readily available, training data can be collected from a variety of existing vehicles. Moreover, by filtering and expanding the radar signal to make it compatible with learning-based approaches, we address radar-inherent issues such as noise and sparsity. With R4Dyn we overcome a major limitation of self-supervised depth estimation, namely the depth prediction of traffic participants. We substantially improve the estimation on dynamic objects, such as cars, by 37% on the challenging nuScenes dataset, demonstrating that radar is a valuable additional sensor for monocular depth estimation in autonomous vehicles. Additionally, we plan to make the code publicly available.
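To make the idea of "filtering and expanding" sparse radar returns and using them as weak supervision more concrete, the following is a minimal sketch, not the R4Dyn implementation. It assumes radar points have already been projected into the image as hypothetical `(u, v, depth)` triples, applies a crude range filter, expands each point into a small pixel patch (taller than wide, on the assumption that radar elevation is uncertain), and computes a masked L1 penalty against a predicted depth map. The function names, patch sizes, range limit, and choice of L1 are all illustrative assumptions.

```python
import numpy as np


def expand_radar(points, height, width, half_w=1, half_h=3, max_depth=80.0):
    """Rasterize projected radar points into a depth map (0 = no measurement).

    points: iterable of (u, v, depth), with u, v in pixels and depth in metres.
    All parameter defaults are illustrative assumptions, not values from the paper.
    """
    sparse = np.zeros((height, width), dtype=np.float32)
    for u, v, d in points:
        if not (0.0 < d <= max_depth):
            continue  # crude filter: drop implausible ranges
        u, v = int(round(u)), int(round(v))
        if not (0 <= u < width and 0 <= v < height):
            continue  # drop points that project outside the image
        r0, r1 = max(v - half_h, 0), min(v + half_h + 1, height)
        c0, c1 = max(u - half_w, 0), min(u + half_w + 1, width)
        patch = sparse[r0:r1, c0:c1]  # view into the full map
        # Where patches overlap, keep the nearer measurement.
        patch[...] = np.where((patch == 0) | (d < patch), d, patch)
    return sparse


def weak_radar_loss(pred_depth, radar_depth):
    """Mean absolute error, evaluated only where radar provides a value."""
    mask = radar_depth > 0
    if not mask.any():
        return 0.0
    return float(np.abs(pred_depth[mask] - radar_depth[mask]).mean())


if __name__ == "__main__":
    pts = [(5, 5, 10.0), (5, 6, 4.0)]
    radar = expand_radar(pts, height=12, width=10)
    pred = np.full((12, 10), 5.0, dtype=np.float32)
    print(weak_radar_loss(pred, radar))
```

In a training setup of this kind, such a term would typically be added with a small weight to the self-supervised photometric loss, so that the noisy radar signal guides rather than dominates the depth prediction.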