Aiming at highly accurate object detection for connected and automated vehicles (CAVs), this paper presents a deep neural network-based 3D object detection model that combines a three-stage feature extractor with a novel LiDAR-camera fusion scheme. The feature extractor derives high-level features from the two input sensory modalities and recovers important features discarded during the convolutional process. The fusion scheme fuses features across sensory modalities and convolutional layers to find the most representative global features. The fused features are shared by a two-stage network: the region proposal network (RPN) and the detection head (DH). The RPN generates high-recall proposals, and the DH produces the final detection results. Experimental results show that the proposed model outperforms recent work on the KITTI 2D and 3D detection benchmarks, particularly for distant and highly occluded instances.
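The data flow described in the abstract (fuse features across modalities and layers, propose regions with high recall, then refine in a detection head) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `fuse_features`, `region_proposals`, and `detection_head` are hypothetical, element-wise averaging stands in for the paper's fusion scheme, and top-k selection and score thresholding stand in for the learned RPN and DH. It is not the authors' implementation.

```python
from typing import List, Sequence

Vector = List[float]


def fuse_features(lidar_feats: Sequence[Vector],
                  camera_feats: Sequence[Vector]) -> Vector:
    """Fuse per-layer feature vectors from both modalities.

    Assumption: an element-wise mean is used as a placeholder for the
    paper's fusion scheme, which the abstract does not specify.
    """
    all_feats = list(lidar_feats) + list(camera_feats)
    if not all_feats:
        raise ValueError("at least one feature vector is required")
    dim = len(all_feats[0])
    if any(len(f) != dim for f in all_feats):
        raise ValueError("all feature vectors must have the same length")
    return [sum(f[i] for f in all_feats) / len(all_feats) for i in range(dim)]


def region_proposals(scores: Sequence[float], top_k: int) -> List[int]:
    """Stage 1 (RPN stand-in): keep the indices of the top_k best-scoring
    candidates. A generous top_k favours recall over precision."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:top_k]


def detection_head(proposals: Sequence[int], scores: Sequence[float],
                   threshold: float) -> List[int]:
    """Stage 2 (DH stand-in): keep only proposals whose score reaches the
    threshold, producing the final detections."""
    return [i for i in proposals if scores[i] >= threshold]
```

In this toy setup, the first stage deliberately keeps more candidates than will survive, and the second stage prunes them; that division of labour mirrors the high-recall RPN followed by a DH described above, while the actual scoring in the paper is learned from the shared fused features.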