In recent years, Lift-Splat-Shot-based (LSS-based) 3D object detection methods have made great progress. These methods convert features from the 2D camera view and the 3D lidar view to Bird's-Eye-View (BEV) for feature fusion. However, inaccurate depth estimation (e.g. the 'depth jump' problem) remains an obstacle to developing LSS-based methods. To alleviate the 'depth jump' problem, we propose the Edge-Aware Bird's-Eye-View (EA-BEV) projector. By coupling the proposed edge-aware depth fusion module with a depth estimation module, the EA-BEV projector addresses the problem and enforces refined supervision on depth. In addition, we propose sparse depth supervision and gradient edge depth supervision to constrain learning of global depth and local edge depth information. The EA-BEV projector is a plug-and-play module for any LSS-based 3D object detection model and effectively improves baseline performance. We demonstrate its effectiveness on the nuScenes benchmark: on the nuScenes validation set, the proposed EA-BEV projector boosts several state-of-the-art LSS-based baselines on both the 3D object detection benchmark and the BEV map segmentation benchmark, with a negligible increase in inference time.
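To make the 'depth jump' idea concrete, the following is a minimal sketch of how depth discontinuities could be located in a dense depth map using image gradients. It is an illustration only, not the EA-BEV formulation: the abstract does not specify how the edge-aware module or the gradient edge depth supervision is computed, and the function name `depth_edge_map`, the `threshold` parameter, and the toy depth values are all assumptions made for this example.

```python
import numpy as np


def depth_edge_map(depth, threshold=1.0):
    """Hypothetical helper: mark pixels where depth changes abruptly.

    depth: 2D array of per-pixel depth in metres (assumed dense).
    threshold: assumed cut-off on the per-pixel depth gradient.
    Returns (gradient magnitude, boolean edge mask).
    """
    depth = np.asarray(depth, dtype=np.float64)
    # np.gradient returns one array per axis: rows (y) first, then columns (x).
    grad_y, grad_x = np.gradient(depth)
    # Use the larger of the two directional gradients as the edge strength.
    grad = np.maximum(np.abs(grad_x), np.abs(grad_y))
    return grad, grad > threshold


# Toy scene: a near object at 5 m on the left, background at 20 m on the right.
depth = np.full((4, 6), 5.0)
depth[:, 3:] = 20.0

grad, edges = depth_edge_map(depth, threshold=1.0)
print(edges.astype(int))
```

In this toy case the mask is set only in the two columns that straddle the boundary between the near object and the background, which is the kind of region where a depth estimate is most likely to 'jump'. A loss restricted to such a mask is one plausible way to emphasise local edge depth, but whether EA-BEV does this is not stated in the text above.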