Weakly Supervised Object Detection (WSOD), which aims to train detectors with only image-level annotations, has attracted increasing attention. Current state-of-the-art approaches mainly follow a two-stage training strategy that integrates a fully supervised detector (FSD) with a pure WSOD model. Two main problems hinder the performance of these two-phase WSOD approaches: insufficient learning, and the strict dependence of the FSD on the pseudo ground truth (PGT) generated by the WSOD model. This paper proposes the pseudo ground truth refinement network (PGTRNet), a simple yet effective method that introduces no extra learnable parameters, to address these problems. PGTRNet uses multiple bounding boxes to establish the PGT, mitigating the insufficient learning problem. In addition, we propose a novel online PGT refinement approach that steadily improves the quality of the PGT by taking full advantage of the FSD during second-phase training, decoupling the first- and second-phase models. Extensive experiments on the PASCAL VOC 2007 benchmark verify the effectiveness of our method. Experimental results demonstrate that PGTRNet boosts the backbone model by 2.074% mAP and achieves state-of-the-art performance, showing the significant potential of second-phase training.
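As a rough illustration of what establishing the PGT from multiple bounding boxes could look like, the following minimal sketch keeps several high-scoring proposals per image-level class instead of only the top-scoring one. The abstract does not specify the selection rule, so the function name `select_pseudo_ground_truth`, the `top_k` cap, and the relative score threshold `rel_thresh` are all hypothetical assumptions and should not be read as PGTRNet's actual procedure.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def select_pseudo_ground_truth(
    boxes: List[Box],
    scores: List[List[float]],
    image_labels: List[int],
    top_k: int = 3,
    rel_thresh: float = 0.8,
) -> List[Tuple[Box, int]]:
    """Pick up to top_k proposals per present class as pseudo ground truth.

    boxes[i] is a proposal and scores[i][c] is the first-phase (WSOD) score
    of proposal i for class c. Both top_k and rel_thresh are illustrative
    assumptions, not values taken from the paper.
    """
    pgt: List[Tuple[Box, int]] = []
    for cls in image_labels:
        # Rank proposals by their score for this class, highest first.
        ranked = sorted(
            range(len(boxes)), key=lambda i: scores[i][cls], reverse=True
        )
        if not ranked:
            continue
        best = scores[ranked[0]][cls]
        # Keep the best box plus any runner-up scoring close to it.
        for i in ranked[:top_k]:
            if scores[i][cls] >= rel_thresh * best:
                pgt.append((boxes[i], cls))
    return pgt


if __name__ == "__main__":
    proposals = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    wsod_scores = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.7]]
    for box, cls in select_pseudo_ground_truth(proposals, wsod_scores, [0, 1]):
        print(cls, box)
```

Setting `top_k=1` recovers the single-box PGT that the abstract associates with insufficient learning; larger values let more instances of a class contribute supervision to the second-phase detector.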