While radar and video data can be readily fused at the detection level, fusing them at the pixel level is potentially more beneficial. It is also more challenging, in part due to the sparsity of radar, but also because automotive radar beams are much wider than a typical pixel; combined with the large baseline between camera and radar, this results in poor association between radar pixels and color pixels. A consequence is that depth completion methods designed for LiDAR and video fare poorly for radar and video. Here we propose a radar-to-pixel association stage that learns a mapping from radar returns to pixels. This mapping also serves to densify radar returns. Using this as a first stage, followed by a more traditional depth completion method, we achieve image-guided depth completion with radar and video. We demonstrate performance superior to camera and radar alone on the nuScenes dataset. Our source code is available at https://github.com/longyunf/rc-pda.
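To make the two-stage pipeline concrete, below is a minimal PyTorch sketch of the idea: a first network predicts, for each image pixel, association weights over nearby projected radar returns and uses them to spread the sparse radar depths into a denser map, which a second, conventional image-guided completion network then refines. The module names (`AssociationNet`, `CompletionNet`), layer choices, and neighborhood size are illustrative assumptions, not the paper's exact RC-PDA architecture; see the repository above for the actual implementation.

```python
# Hedged sketch of the two-stage radar-camera depth completion pipeline.
# All architecture details here are simplified placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AssociationNet(nn.Module):
    """Stage 1 (sketch): per-pixel association weights over a local
    neighborhood of projected radar depths, used to densify the radar."""

    def __init__(self, k: int = 9):
        super().__init__()
        self.k = k  # spread radar depths over a k x k pixel neighborhood
        # Toy encoder over concatenated image (3 ch) + sparse radar depth (1 ch).
        self.net = nn.Sequential(
            nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, k * k, 3, padding=1),
        )

    def forward(self, image, radar_depth):
        logits = self.net(torch.cat([image, radar_depth], dim=1))  # (B, k*k, H, W)
        weights = torch.softmax(logits, dim=1)
        # Gather each pixel's k*k neighboring radar depths via unfold.
        B, _, H, W = radar_depth.shape
        patches = F.unfold(radar_depth, self.k, padding=self.k // 2)
        patches = patches.view(B, self.k * self.k, H, W)
        valid = (patches > 0).float()  # radar is sparse: 0 means no return
        w = weights * valid
        w = w / w.sum(dim=1, keepdim=True).clamp(min=1e-6)
        return (w * patches).sum(dim=1, keepdim=True)  # densified radar depth


class CompletionNet(nn.Module):
    """Stage 2 (sketch): conventional image-guided depth completion
    applied to the densified radar depth."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, image, dense_radar_depth):
        return self.net(torch.cat([image, dense_radar_depth], dim=1))


if __name__ == "__main__":
    image = torch.rand(1, 3, 96, 160)    # RGB frame
    radar = torch.zeros(1, 1, 96, 160)   # sparse projected radar depth map
    radar[0, 0, 60, ::20] = 25.0         # a handful of returns at 25 m
    dense = AssociationNet()(image, radar)
    depth = CompletionNet()(image, dense)
    print(dense.shape, depth.shape)      # both torch.Size([1, 1, 96, 160])
```

The key design point the sketch illustrates is the decoupling: the association stage compensates for the wide radar beam and the radar-camera baseline by learning where each return belongs in the image, so the completion stage can treat the densified radar much like it would treat projected LiDAR.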