Accurate 3D object detection is essential for automated vehicles to navigate safely in complex real-world environments. Bird's Eye View (BEV) representations, which project multi-sensor data into a top-down spatial format, have emerged as a powerful approach for robust perception. Although BEV-based fusion architectures have demonstrated strong performance through multimodal integration, the effects of sensor occlusions, caused by environmental conditions such as fog, haze, or physical obstructions, on 3D detection accuracy remain underexplored. In this work, we investigate the impact of occlusions on both camera and Light Detection and Ranging (LiDAR) inputs using the BEVFusion architecture, evaluated on the nuScenes dataset. Detection performance is measured using mean Average Precision (mAP) and the nuScenes Detection Score (NDS). Our results show that in camera-only detection, moderate occlusions cause a 41.3% relative drop in mAP (from 35.6% to 20.9%). LiDAR-only performance, by contrast, degrades sharply only under heavy occlusion, with mAP falling by 47.3% (from 64.7% to 34.1%) and long-range detection severely affected. In fused settings, the effect depends on which sensor is occluded: occluding the camera leads to a minor 4.1% drop (from 68.5% to 65.7%), while occluding LiDAR results in a larger 26.8% drop (to 50.1%), revealing the model's stronger reliance on LiDAR for 3D object detection. These results highlight the need for occlusion-aware evaluation methods and improved sensor fusion techniques that maintain detection accuracy under partial sensor failure or degradation in adverse environmental conditions.
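The occlusion perturbations described above can be approximated with simple input corruptions. The sketch below is a minimal illustration, not the paper's actual protocol: `occlude_image` zeroes a contiguous patch of a camera frame (a crude stand-in for fog or an obstruction), and `occlude_lidar` drops points with a probability that grows with range, mirroring the severe long-range degradation reported for heavy LiDAR occlusion. Function names, the block-masking scheme, and the range-biased drop rule are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def occlude_image(img, frac=0.3):
    """Zero out one contiguous block covering roughly `frac` of the image
    area. A crude proxy for a physical obstruction on the camera."""
    h, w = img.shape[:2]
    bh, bw = int(h * frac ** 0.5), int(w * frac ** 0.5)
    y0 = rng.integers(0, h - bh + 1)
    x0 = rng.integers(0, w - bw + 1)
    out = img.copy()
    out[y0:y0 + bh, x0:x0 + bw] = 0
    return out

def occlude_lidar(points, drop_frac=0.5):
    """Randomly drop LiDAR returns, with farther points more likely to be
    dropped, loosely mimicking range-dependent attenuation (e.g. fog)."""
    dist = np.linalg.norm(points[:, :3], axis=1)
    p_drop = drop_frac * dist / dist.max()   # per-point drop probability
    keep = rng.random(len(points)) >= p_drop
    return points[keep]
```

In a real evaluation, such corrupted inputs would be fed through the detector and scored with the standard nuScenes mAP/NDS toolkit; the functions here only produce the perturbed inputs.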


