Pseudo-LiDAR-based methods for monocular 3D object detection have received considerable attention in the community because of the performance gains they exhibit on the KITTI3D benchmark, in particular on the commonly reported validation split. This has created a distorted impression of the superiority of Pseudo-LiDAR-based (PL-based) approaches over methods that work with RGB images only. Our first contribution is to rectify this view by pointing out, and showing experimentally, that the validation results published by PL-based methods are substantially biased. The bias stems from an overlap between the KITTI3D object detection validation set and the training/validation sets used to train the depth predictors that feed PL-based methods. Surprisingly, the bias persists even after the overlap is removed geographically. This leaves the test set as the only reliable basis for comparison, and there published PL-based methods do not excel. Our second contribution brings PL-based methods back up in the ranking through a novel deep architecture that introduces a 3D confidence prediction module. We show that 3D confidence estimation techniques derived from RGB-only 3D detection approaches can be successfully integrated into our framework and, more importantly, that a newly designed 3D confidence measure improves performance, leading to state-of-the-art performance on the KITTI3D benchmark.
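The kind of overlap described above can be illustrated with a simple split audit. The following is a minimal sketch, not the procedure used in this work: it assumes a hypothetical mapping `frame_to_drive` from frame identifiers to the recording drive each frame was taken from, and reports which drives are shared by a detection validation split and a depth-training split. All names and the toy data are illustrative assumptions.

```python
from typing import Dict, Iterable, Set, Tuple


def split_overlap(
    val_frames: Iterable[str],
    depth_train_frames: Iterable[str],
    frame_to_drive: Dict[str, str],
) -> Tuple[Set[str], float]:
    """Return the drives shared by both splits and the fraction of
    validation frames that come from a shared drive."""
    val_frames = list(val_frames)
    val_drives = {frame_to_drive[f] for f in val_frames}
    depth_drives = {frame_to_drive[f] for f in depth_train_frames}
    shared = val_drives & depth_drives
    if not val_frames:
        return shared, 0.0
    # Count validation frames recorded in a drive the depth network also saw.
    leaked = sum(1 for f in val_frames if frame_to_drive[f] in shared)
    return shared, leaked / len(val_frames)


# Hypothetical toy data: identifiers and drive names are made up.
frame_to_drive = {
    "det_000": "drive_A",
    "det_001": "drive_A",
    "det_002": "drive_B",
    "det_003": "drive_C",
    "depth_000": "drive_A",
    "depth_001": "drive_D",
}
shared, fraction = split_overlap(
    ["det_000", "det_001", "det_002", "det_003"],
    ["depth_000", "depth_001"],
    frame_to_drive,
)
print(shared, fraction)
```

A drive-level check like this only detects shared recordings. Since the bias is reported to persist even after geographic removal of the overlap, a clean result from such an audit should not be read as evidence that validation numbers are unbiased.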