In this work, we present an uncertainty-based method for sensor fusion of camera and radar data. The outputs of two neural networks, one processing camera data and the other radar data, are combined in an uncertainty-aware manner. To this end, we gather the outputs and corresponding meta information of both networks. For each predicted object, the gathered information is post-processed by a gradient boosting method to produce a joint prediction of both networks. In our experiments, we combine the YOLOv3 object detection network with a customized 1D radar segmentation network and evaluate our method on the nuScenes dataset. In particular, we focus on night scenes, where the capability of object detection networks based on camera data is potentially handicapped. Our experiments show that this uncertainty-aware fusion approach, which is also highly modular by nature, significantly improves performance over single-sensor baselines and is on par with specifically tailored deep-learning-based fusion approaches.
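The fusion step described above can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's actual pipeline: the meta-features (camera confidence, box area, radar score, radar point count) and the toy labels are assumptions chosen only to show how per-object meta information from two networks can be post-processed by a gradient boosting classifier into a single fused confidence.

```python
# Hypothetical sketch of uncertainty-aware fusion via gradient boosting.
# Feature choices and synthetic data are illustrative assumptions.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)

# Toy meta information per predicted object, gathered from both networks:
n = 200
X = np.column_stack([
    rng.uniform(0, 1, n),        # camera detector confidence
    rng.uniform(100, 5000, n),   # predicted bounding box area (px^2)
    rng.uniform(0, 1, n),        # radar segmentation network score
    rng.integers(0, 20, n),      # number of radar points inside the box
])
# Toy labels: whether the predicted object is a true positive.
y = (0.6 * X[:, 0] + 0.4 * X[:, 2] + rng.normal(0, 0.1, n) > 0.5).astype(int)

# Gradient boosting post-processes the gathered meta information
# into a joint prediction for each object.
gb = GradientBoostingClassifier(n_estimators=50, max_depth=2, random_state=0)
gb.fit(X, y)

# Fused per-object confidence: probability of being a correct detection.
fused_scores = gb.predict_proba(X)[:, 1]
print(fused_scores.shape)
```

Because the classifier only consumes per-object meta-features, either single-sensor network can be swapped out without retraining the other, which is the modularity the abstract refers to.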