The inherent ambiguity in ground-truth annotations of 3D bounding boxes, caused by occlusions, missing signal, or manual annotation errors, can confuse deep 3D object detectors during training and thus degrade detection accuracy. However, existing methods largely overlook this issue and treat the labels as deterministic. In this paper, we propose GLENet, a generative label uncertainty estimation framework adapted from conditional variational autoencoders, which uses latent variables to model the one-to-many relationship between a typical 3D object and its potential ground-truth bounding boxes. The label uncertainty generated by GLENet serves as a plug-and-play module that can be conveniently integrated into existing deep 3D detectors to build probabilistic detectors and to supervise the learning of localization uncertainty. In addition, we propose an uncertainty-aware quality estimator architecture for probabilistic detectors, which uses the predicted localization uncertainty to guide the training of the IoU branch. We incorporate the proposed methods into several popular base 3D detectors and observe that their performance improves significantly, reaching the current state of the art on the Waymo Open dataset and the KITTI dataset.
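To make the idea of label uncertainty concrete, the following is a minimal sketch rather than the GLENet implementation. It assumes that label uncertainty is taken as the per-dimension variance of boxes decoded from several latent samples, and that a probabilistic detector is supervised with a KL divergence between a Gaussian label distribution and a Gaussian prediction; the abstract states neither detail. `toy_decoder`, the 7-value box layout, and all numeric values are hypothetical stand-ins for a trained conditional decoder.

```python
import math
import random


def sample_boxes(decoder, n_samples, latent_dim, rng):
    """Draw latent codes z ~ N(0, I) and decode each one into a candidate box."""
    return [
        decoder([rng.gauss(0.0, 1.0) for _ in range(latent_dim)])
        for _ in range(n_samples)
    ]


def label_uncertainty(boxes):
    """Per-dimension (population) variance across the sampled boxes."""
    n = len(boxes)
    dims = len(boxes[0])
    means = [sum(b[d] for b in boxes) / n for d in range(dims)]
    return [sum((b[d] - means[d]) ** 2 for b in boxes) / n for d in range(dims)]


def gaussian_kl(mu_label, var_label, mu_pred, var_pred):
    """KL(N(mu_label, var_label) || N(mu_pred, var_pred)) for 1-D Gaussians.

    Both variances must be strictly positive.
    """
    return 0.5 * (
        math.log(var_pred / var_label)
        + (var_label + (mu_label - mu_pred) ** 2) / var_pred
        - 1.0
    )


def toy_decoder(z):
    """Hypothetical decoder: an occluded object whose length is ambiguous."""
    box = [10.0, 2.0, -1.0, 4.0, 1.8, 1.5, 0.0]  # x, y, z, l, w, h, yaw (assumed layout)
    box[0] += 0.05 * z[0]  # centre x is well constrained
    box[3] += 0.50 * z[1]  # length varies strongly between plausible labels
    return box


rng = random.Random(0)
boxes = sample_boxes(toy_decoder, n_samples=200, latent_dim=2, rng=rng)
variances = label_uncertainty(boxes)

# Example regression loss for the length dimension: the label is treated as a
# Gaussian with the estimated variance; the detector output here is made up.
loss_length = gaussian_kl(
    mu_label=4.0, var_label=variances[3], mu_pred=4.2, var_pred=0.3
)
```

In this toy setup the variance of the length dimension comes out much larger than that of the centre coordinate, which is the kind of signal a probabilistic detector could use to avoid over-trusting an ambiguous label.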