Ground-truth annotations of 3D bounding boxes are inherently ambiguous because of occlusion, missing signal, or manual annotation errors. This ambiguity can confuse deep 3D object detectors during training and degrade detection accuracy. However, existing methods overlook the issue to some extent and treat labels as deterministic. In this paper, we formulate label uncertainty as the diversity of potentially plausible bounding boxes for an object, and propose GLENet, a generative framework adapted from conditional variational autoencoders, to model the one-to-many relationship between a typical 3D object and its potential ground-truth bounding boxes with latent variables. The label uncertainty generated by GLENet is plug-and-play: it can be conveniently integrated into existing deep 3D detectors to build probabilistic detectors and to supervise the learning of localization uncertainty. In addition, we propose an uncertainty-aware quality estimator architecture for probabilistic detectors that uses the predicted localization uncertainty to guide the training of the IoU branch. We incorporate the proposed methods into various popular base 3D detectors and demonstrate significant and consistent performance gains on both the KITTI and Waymo benchmark datasets. In particular, the proposed GLENet-VR outperforms all published LiDAR-based approaches by a large margin and ranks $1^{st}$ among single-modal methods on the challenging KITTI test set. The code is available at https://github.com/Eaphan/GLENet.
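The abstract does not spell out how a set of plausible boxes becomes a label uncertainty, nor which loss supervises the localization uncertainty. As a minimal sketch only, assume that several plausible boxes have already been sampled for one object (here hard-coded, standing in for the generative model's output), that label uncertainty is taken as the per-parameter variance across those samples, and that a predicted Gaussian is compared with the label Gaussian through a KL divergence. The function names, the 7-parameter box layout `(x, y, z, l, w, h, yaw)`, and the direction of the KL term are all illustrative assumptions, not the released implementation.

```python
import math
from statistics import pvariance


def label_uncertainty(sampled_boxes):
    """Per-parameter variance across plausible boxes of one object.

    Each box is assumed to be (x, y, z, l, w, h, yaw). Yaw is treated as a
    plain number here, so angle wrap-around is ignored (an assumption).
    """
    columns = zip(*sampled_boxes)  # transpose: one column per box parameter
    return [pvariance(column) for column in columns]


def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """KL(N(mu_p, var_p) || N(mu_q, var_q)) for univariate Gaussians."""
    return 0.5 * (
        math.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0
    )


# Hypothetical samples: the boxes disagree on x, length, and yaw only.
boxes = [
    [10.0, 2.0, -1.0, 4.0, 1.6, 1.5, 0.00],
    [10.2, 2.0, -1.0, 4.4, 1.6, 1.5, 0.02],
    [9.8, 2.0, -1.0, 3.6, 1.6, 1.5, -0.02],
]
variances = label_uncertainty(boxes)

# Compare a hypothetical detector prediction for x with the label Gaussian.
kl_x = gaussian_kl(mu_p=10.1, var_p=0.05, mu_q=10.0, var_q=variances[0])
```

Parameters on which all samples agree get zero variance, so a real pipeline would need a small floor on the variance before it is used as a denominator or inside a logarithm.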