We present the Self-Ensembling Single-Stage object Detector (SE-SSD) for accurate and efficient 3D object detection in outdoor point clouds. Our key focus is on exploiting both soft and hard targets, together with our formulated constraints, to jointly optimize the model without introducing extra computation at inference. Specifically, SE-SSD contains a pair of teacher and student SSDs, for which we design an effective IoU-based matching strategy to filter soft targets from the teacher and formulate a consistency loss to align the student's predictions with them. Also, to maximize the distilled knowledge for ensembling the teacher, we design a new augmentation scheme that produces shape-aware augmented samples to train the student, encouraging it to infer complete object shapes. Lastly, to better exploit hard targets, we design an ODIoU loss to supervise the student with constraints on the predicted box centers and orientations. SE-SSD attains top performance compared with all prior published works. It also attains top precision for car detection on the KITTI benchmark (ranked 1st and 2nd on the BEV and 3D leaderboards, respectively) with an ultra-high inference speed. The code is available at https://github.com/Vegeta2020/SE-SSD.
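The IoU-based filtering of soft targets and the consistency loss described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes axis-aligned bird's-eye-view boxes `(x_min, y_min, x_max, y_max)` in place of oriented 3D boxes, a hypothetical threshold `iou_thresh=0.7`, and a smooth-L1 penalty as the consistency term. The function names (`bev_iou`, `match_soft_targets`, `consistency_loss`) are illustrative only.

```python
def bev_iou(a, b):
    """IoU of two axis-aligned boxes (x_min, y_min, x_max, y_max).

    Assumption: real detectors use oriented 3D boxes; this is simplified.
    """
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_soft_targets(student_boxes, teacher_boxes, iou_thresh=0.7):
    """Pair each student box with its highest-IoU teacher box.

    Pairs whose IoU falls below iou_thresh (a hypothetical value) are
    dropped, so only well-aligned teacher predictions act as soft targets.
    """
    pairs = []
    if not teacher_boxes:
        return pairs
    for i, s in enumerate(student_boxes):
        ious = [bev_iou(s, t) for t in teacher_boxes]
        j = max(range(len(ious)), key=ious.__getitem__)
        if ious[j] >= iou_thresh:
            pairs.append((i, j))
    return pairs


def smooth_l1(x, beta=1.0):
    """Smooth-L1 penalty on a scalar residual."""
    ax = abs(x)
    return 0.5 * ax * ax / beta if ax < beta else ax - 0.5 * beta


def consistency_loss(student_boxes, teacher_boxes, pairs):
    """Mean smooth-L1 distance between matched student and teacher boxes."""
    terms = [
        smooth_l1(sc - tc)
        for i, j in pairs
        for sc, tc in zip(student_boxes[i], teacher_boxes[j])
    ]
    return sum(terms) / len(terms) if terms else 0.0


# Toy usage: only the first student box overlaps a teacher box enough to match.
student = [[0.0, 0.0, 2.0, 2.0], [10.0, 10.0, 12.0, 12.0]]
teacher = [[0.0, 0.0, 2.0, 2.2], [50.0, 50.0, 52.0, 52.0]]
pairs = match_soft_targets(student, teacher)
loss = consistency_loss(student, teacher, pairs)
```

In this sketch the unmatched student box contributes nothing to the consistency term. It would still be supervised by the hard targets, which is the role the abstract assigns to the ODIoU loss.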