Accurate and reliable 3D object detection is vital to safe autonomous driving. Despite recent developments, the performance gap between stereo-based methods and LiDAR-based methods is still considerable. Accurate depth estimation is crucial to the performance of stereo-based 3D object detection methods, particularly for those pixels associated with objects in the foreground. Moreover, stereo-based methods suffer from high variance in the depth estimation accuracy, which is often not considered in the object detection pipeline. To tackle these two issues, we propose CG-Stereo, a confidence-guided stereo 3D object detection pipeline that uses separate decoders for foreground and background pixels during depth estimation, and leverages the confidence estimation from the depth estimation network as a soft attention mechanism in the 3D object detector. Our approach outperforms all state-of-the-art stereo-based 3D detectors on the KITTI benchmark.
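The two ideas named in the abstract can be illustrated with a toy example. The abstract gives no implementation details, so this is only a minimal sketch under stated assumptions: the function names `merge_depth` and `apply_confidence` are hypothetical, the foreground mask is assumed to be given, and scaling per-pixel features by a confidence in [0, 1] is one simple reading of "soft attention", not necessarily the mechanism used in CG-Stereo.

```python
from typing import List

Grid = List[List[float]]          # H x W map of per-pixel values
Mask = List[List[bool]]           # H x W foreground mask (assumed given)
Features = List[List[List[float]]]  # H x W x C per-pixel feature vectors


def merge_depth(fg_depth: Grid, bg_depth: Grid, fg_mask: Mask) -> Grid:
    """Take the foreground decoder's depth where the mask is True,
    and the background decoder's depth elsewhere (hypothetical helper)."""
    return [
        [f if m else b for f, b, m in zip(f_row, b_row, m_row)]
        for f_row, b_row, m_row in zip(fg_depth, bg_depth, fg_mask)
    ]


def apply_confidence(features: Features, confidence: Grid) -> Features:
    """Scale each pixel's feature vector by its depth confidence,
    clamped to [0, 1], so low-confidence pixels contribute less
    (an assumed, simplified form of soft attention)."""
    return [
        [[min(max(c, 0.0), 1.0) * v for v in feat]
         for feat, c in zip(f_row, c_row)]
        for f_row, c_row in zip(features, confidence)
    ]


# Toy 2x2 image: depths in metres, two feature channels per pixel.
fg_depth = [[5.0, 6.0], [7.0, 8.0]]
bg_depth = [[50.0, 60.0], [70.0, 80.0]]
fg_mask = [[True, False], [False, True]]
confidence = [[1.0, 0.5], [0.0, 0.25]]
features = [[[2.0, 4.0], [2.0, 4.0]], [[2.0, 4.0], [2.0, 4.0]]]

depth = merge_depth(fg_depth, bg_depth, fg_mask)
weighted = apply_confidence(features, confidence)
```

In this sketch a pixel with confidence 0 is suppressed entirely, while a fully confident pixel passes through unchanged; a real detector would operate on learned feature tensors rather than nested lists.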