In this paper, we delve into two key techniques in Semi-Supervised Object Detection (SSOD), namely pseudo labeling and consistency training. We observe that these two techniques currently neglect some important properties of object detection, hindering efficient learning on unlabeled data. Specifically, for pseudo labeling, existing works focus only on the classification score and fail to guarantee the localization precision of pseudo boxes; for consistency training, the widely adopted random-resize training considers only label-level consistency but misses feature-level consistency, which also plays an important role in ensuring scale invariance. To address the problems incurred by noisy pseudo boxes, we design Noisy Pseudo box Learning (NPL), which includes Prediction-guided Label Assignment (PLA) and Positive-proposal Consistency Voting (PCV). PLA relies on model predictions to assign labels, making the assignment robust even to coarse pseudo boxes, while PCV leverages the regression consistency of positive proposals to reflect the localization quality of pseudo boxes. Furthermore, in consistency training, we propose Multi-view Scale-invariant Learning (MSL), which includes mechanisms for both label- and feature-level consistency, where feature consistency is achieved by aligning shifted feature pyramids between two images with identical content but varied scales. On the COCO benchmark, our method, termed PSEudo labeling and COnsistency training (PseCo), outperforms the SOTA (Soft Teacher) by 2.0, 1.8, and 2.0 points under 1%, 5%, and 10% labeling ratios, respectively. It also significantly improves the learning efficiency for SSOD, e.g., PseCo halves the training time of the SOTA approach while achieving even better performance. Code is available at https://github.com/ligang-cs/PseCo.
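To make the PCV idea concrete, the sketch below scores a pseudo box by how consistently its positive proposals regress toward it, here using the mean IoU between each proposal's regressed box and the pseudo box as the quality score. This is a minimal illustration under our own assumptions (function names, box format, and mean-IoU scoring are ours), not the paper's exact formulation.

```python
import numpy as np

def iou(box_a, box_b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def pcv_quality(pseudo_box, regressed_boxes):
    """Localization-quality score for one pseudo box.

    regressed_boxes: the boxes predicted by the positive proposals
    assigned to this pseudo box. If they regress consistently onto
    the pseudo box, the mean IoU is high and the box is trusted more.
    """
    ious = [iou(pseudo_box, b) for b in regressed_boxes]
    return float(np.mean(ious))
```

A well-localized pseudo box whose positive proposals all regress tightly onto it receives a score near 1, whereas scattered regressions pull the score down, so the loss weight of that pseudo box can be reduced accordingly.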