Monocular 3D object detection has attracted great attention for its simplicity and low cost. Because mapping a 2D image back to 3D is ill-posed in monocular imaging, monocular 3D object detection suffers from inaccurate depth estimation, which in turn degrades 3D detection results. To alleviate this problem, we propose introducing the ground plane as a prior in monocular 3D object detection. The ground plane prior serves as an additional geometric constraint on the ill-posed mapping and as an extra source of depth information, so that depth can be estimated more accurately from the ground. To take full advantage of the ground plane prior, we also propose a depth-align training strategy and a precise two-stage depth inference method tailored to it. Notably, the ground plane prior requires no extra data sources such as LiDAR, stereo images, or depth information. Extensive experiments on the KITTI benchmark show that our method achieves state-of-the-art results compared with other methods while maintaining a very fast speed. Our code and models are available at https://github.com/cfzd/MonoGround.
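To illustrate why a ground plane can act as a source of depth, the following minimal sketch recovers the depth of a pixel known to lie on the ground. It is not the authors' implementation, only the textbook pinhole relation under strong assumptions: a flat ground plane, a camera with zero pitch and roll, and an image y-axis pointing down. The function name `ground_depth`, its parameters, and the numeric values in the usage note are hypothetical.

```python
def ground_depth(v, fy, cy, cam_height):
    """Depth (in metres) of a ground pixel at image row v.

    Assumptions (illustrative, not taken from the paper):
      - pinhole camera with vertical focal length fy and principal row cy (pixels)
      - camera mounted cam_height metres above a flat ground plane
      - zero pitch and roll, image y-axis pointing down

    A ground point has camera-frame height Y = cam_height, and projection gives
    v = fy * Y / Z + cy, hence Z = fy * cam_height / (v - cy).
    """
    dv = v - cy
    if dv <= 0:
        # Rows on or above the horizon never intersect the ground plane.
        raise ValueError("pixel is on or above the horizon")
    return fy * cam_height / dv
```

For example, with the illustrative values `fy=700.0`, `cy=173.0`, and `cam_height=1.65`, a ground pixel 100 rows below the principal row lies about 11.55 m away, and rows closer to the horizon map to larger depths.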