This paper presents novel hybrid architectures that combine grid- and point-based processing to improve the detection performance and orientation estimation of radar-based object detection networks. Purely grid-based detection models operate on a bird's-eye-view (BEV) projection of the input point cloud. These approaches lose fine-grained information because of the discrete grid resolution. This applies in particular to radar object detection, where relatively coarse grid resolutions are commonly used to account for the sparsity of radar point clouds. In contrast, point-based models are not affected by this problem, as they process point clouds without discretization. However, they generally exhibit worse detection performance than grid-based methods. We show that a point-based model can extract neighborhood features, leveraging the exact relative positions of points, before grid rendering. This has significant benefits for a subsequent grid-based convolutional detection backbone. In experiments on the public nuScenes dataset, our hybrid architecture achieves improvements over networks from previous literature in terms of detection performance (19.7% higher mAP for the car class than the next-best radar-only submission) and orientation estimation (11.5% relative orientation improvement).
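The idea of extracting neighborhood features from exact point positions before grid rendering can be illustrated with a minimal NumPy sketch. This is not the architecture from the paper, which relies on learned point-based and convolutional networks; the hand-crafted features (neighbor count and mean neighbor offset), the mean pooling per cell, and all names and parameters (`neighborhood_features`, `render_to_grid`, `radius`, `cell`) are assumptions made purely for illustration.

```python
import numpy as np


def neighborhood_features(points, radius=2.0):
    """Per-point features from exact relative positions (no discretization).

    points: (N, 2) array of x, y positions in metres (hypothetical layout).
    Returns an (N, 3) array: neighbor count, mean dx, mean dy of all
    neighbors within `radius`.
    """
    xy = points[:, :2]
    diff = xy[None, :, :] - xy[:, None, :]      # diff[i, j] = xy[j] - xy[i]
    dist = np.linalg.norm(diff, axis=-1)
    mask = (dist <= radius) & ~np.eye(len(xy), dtype=bool)  # exclude self
    count = mask.sum(axis=1)
    safe = np.maximum(count, 1)[:, None]        # avoid division by zero
    mean_offset = (diff * mask[:, :, None]).sum(axis=1) / safe
    return np.concatenate([count[:, None].astype(float), mean_offset], axis=1)


def render_to_grid(points, feats, x_range=(0.0, 8.0), y_range=(0.0, 8.0), cell=2.0):
    """Scatter per-point features into a BEV grid, averaging within each cell.

    Returns a (C, ny, nx) array that a convolutional backbone could consume.
    """
    nx = int(round((x_range[1] - x_range[0]) / cell))
    ny = int(round((y_range[1] - y_range[0]) / cell))
    ix = np.floor((points[:, 0] - x_range[0]) / cell).astype(int)
    iy = np.floor((points[:, 1] - y_range[0]) / cell).astype(int)
    inside = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)

    grid = np.zeros((feats.shape[1], ny, nx))
    hits = np.zeros((ny, nx))
    np.add.at(hits, (iy[inside], ix[inside]), 1.0)
    for c in range(feats.shape[1]):
        # np.add.at accumulates correctly when several points share a cell
        np.add.at(grid[c], (iy[inside], ix[inside]), feats[inside, c])
    return grid / np.maximum(hits, 1.0)
```

In this sketch the neighborhood features are computed at full positional precision, so two points that fall into the same coarse cell still contribute information about their true spacing. A purely grid-based pipeline would discard that information at the rendering step.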