Self-supervised pre-training based on the pretext task of instance discrimination has fueled recent advances in label-efficient object detection. However, existing studies pre-train only the feature extractor network to learn transferable representations for downstream detection tasks, so multiple detection-specific modules must be trained from scratch during fine-tuning. We argue that the region proposal network (RPN), a common detection-specific module, can also be pre-trained to reduce the localization error of multi-stage detectors. In this work, we propose a simple pretext task that provides effective pre-training for the RPN and efficiently improves downstream object detection performance. We evaluate our approach on benchmark object detection tasks and on additional downstream tasks, including instance segmentation and few-shot detection. Compared with multi-stage detectors without RPN pre-training, our approach consistently improves downstream task performance, with the largest gains in label-scarce settings.