Weakly supervised object detection (WSOD) focuses on training object detectors with only image-level annotations, and is challenging because of the gap between the supervision and the objective. Most existing approaches model WSOD as a multiple instance learning (MIL) problem. However, we observe that the results of MIL-based detectors are unstable, i.e., the most confident bounding boxes change significantly under different initializations. We demonstrate the instability quantitatively by introducing a metric to measure it, and empirically analyze its causes. Although the instability seems harmful for the detection task, we argue that it can be exploited to improve performance by fusing the results of differently initialized detectors. To implement this idea, we propose an end-to-end framework with multiple detection branches and introduce a simple fusion strategy. We further propose an orthogonal initialization method to increase the difference between detection branches. By exploiting the instability, we achieve 52.6% and 48.0% mAP on the challenging PASCAL VOC 2007 and 2012 datasets, respectively, both of which set a new state of the art.
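The two ideas in the abstract, measuring how much differently initialized detectors disagree and fusing their outputs, can be illustrated with a minimal sketch. The abstract does not specify the metric or the fusion rule, so everything here is an assumption made for illustration: the function names `iou`, `top_box_disagreement`, and `fuse_scores` are hypothetical, disagreement is taken as one minus the IoU of the two top-scoring proposals, and fusion is taken as a plain average of per-proposal scores across branches.

```python
import numpy as np


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def top_box_disagreement(proposals, scores_a, scores_b):
    """Assumed instability measure: 1 - IoU of the two detectors' top boxes.

    proposals: (N, 4) array of shared region proposals.
    scores_a, scores_b: (N,) scores from two differently initialized detectors.
    """
    top_a = proposals[int(np.argmax(scores_a))]
    top_b = proposals[int(np.argmax(scores_b))]
    return 1.0 - iou(top_a, top_b)


def fuse_scores(branch_scores):
    """Assumed fusion rule: average per-proposal scores over all branches."""
    return np.mean(np.stack(branch_scores, axis=0), axis=0)


# Toy example: three shared proposals scored by two branches.
proposals = np.array([[0, 0, 10, 10], [0, 0, 20, 20], [30, 30, 40, 40]], dtype=float)
scores_a = np.array([0.9, 0.5, 0.1])
scores_b = np.array([0.3, 0.8, 0.2])

disagreement = top_box_disagreement(proposals, scores_a, scores_b)
fused = fuse_scores([scores_a, scores_b])
```

In this toy case the two branches pick different top boxes, so the disagreement is nonzero, while the averaged scores yield a single ranking over the shared proposals. A real multi-branch detector would compute such quantities per class and per image.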