Image-based 3D detection is an indispensable component of the perception system for autonomous driving. However, its performance remains unsatisfactory, and one of the main reasons is the limited amount of training data. Unfortunately, annotating objects in 3D space is extremely time- and resource-consuming, which makes it hard to extend the training set arbitrarily. In this work, we focus on the semi-supervised setting and explore the feasibility of a cheaper alternative, i.e., pseudo-labeling, to leverage unlabeled data. For this purpose, we conduct extensive experiments to investigate whether pseudo-labels can provide effective supervision for the baseline models under varying settings. The experimental results not only demonstrate the effectiveness of the pseudo-labeling mechanism for image-based 3D detection (e.g., under the monocular setting, we achieve 20.23 AP at the moderate level on the KITTI-3D testing set without bells and whistles, improving the baseline model by 6.03 AP), but also reveal several interesting and noteworthy findings (e.g., models trained with pseudo-labels perform better than those trained with ground-truth annotations on the same training data). We hope this work can provide insights for the image-based 3D detection community under a semi-supervised setting. The code, pseudo-labels, and pre-trained models will be made publicly available.
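To make the pseudo-labeling idea concrete, the following is a minimal sketch of one common variant: a trained model's predictions on unlabeled images are filtered by confidence and then merged with the annotated set for retraining. It is an illustration under stated assumptions, not the procedure used in this work. The `Box3D` record, the function names, and the 0.7 score threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Box3D:
    """Hypothetical 3D detection record (not the paper's data format)."""
    label: str
    score: float                        # detector confidence in [0, 1]
    center: Tuple[float, float, float]  # (x, y, z) in metres
    size: Tuple[float, float, float]    # (h, w, l) in metres
    yaw: float                          # heading angle in radians


def select_pseudo_labels(predictions: Dict[str, List[Box3D]],
                         score_threshold: float = 0.7) -> Dict[str, List[Box3D]]:
    """Keep only confident predictions; drop images left with no boxes.

    The threshold value is an assumption chosen for illustration.
    """
    selected = {}
    for image_id, boxes in predictions.items():
        kept = [box for box in boxes if box.score >= score_threshold]
        if kept:
            selected[image_id] = kept
    return selected


def build_training_set(labeled: Dict[str, List[Box3D]],
                       pseudo: Dict[str, List[Box3D]]) -> Dict[str, List[Box3D]]:
    """Merge annotated and pseudo-labeled images; annotations win on conflict."""
    merged = dict(labeled)
    for image_id, boxes in pseudo.items():
        merged.setdefault(image_id, boxes)
    return merged


# Toy data standing in for ground truth and for model output on unlabeled images.
labeled = {
    "labeled_0001": [Box3D("Car", 1.0, (2.0, 1.6, 15.0), (1.5, 1.6, 3.9), 0.0)],
}
model_predictions = {
    "unlabeled_0001": [
        Box3D("Car", 0.92, (1.5, 1.6, 20.0), (1.5, 1.6, 3.9), 0.1),
        Box3D("Car", 0.35, (-4.0, 1.7, 32.0), (1.4, 1.5, 3.6), 1.5),
    ],
    "unlabeled_0002": [
        Box3D("Pedestrian", 0.40, (0.5, 1.5, 9.0), (1.7, 0.6, 0.8), 0.0),
    ],
}

pseudo = select_pseudo_labels(model_predictions)
training_set = build_training_set(labeled, pseudo)
```

Here only the confident box of `unlabeled_0001` survives filtering, so the merged training set contains one annotated and one pseudo-labeled image. A fixed score threshold is the simplest filtering rule; it trades label noise against the amount of extra data, and other selection criteria are equally possible.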