Object detection using single point supervision has received increasing attention over the years. However, a large performance gap remains between point supervised object detection (PSOD) and bounding-box supervised detection. In this paper, we attribute this gap to the failure to generate high-quality proposal bags, which are crucial for multiple instance learning (MIL). To address this problem, we introduce a lightweight alternative to the off-the-shelf proposal (OTSP) method and thereby create the Point-to-Box Network (P2BNet), which constructs an inter-object balanced proposal bag by generating proposals in an anchor-like way. By fully exploiting the accurate position information, P2BNet further constructs an instance-level bag, avoiding the mixture of multiple objects. Finally, a coarse-to-fine policy in a cascade fashion is used to improve the IoU between proposals and the ground truth (GT). Benefiting from these strategies, P2BNet is able to produce high-quality instance-level bags for object detection. P2BNet improves the mean average precision (AP) by more than 50% relative to the previous best PSOD method on the MS COCO dataset. It also demonstrates great potential to bridge the performance gap between point supervised and bounding-box supervised detectors. The code will be released at github.com/ucas-vg/P2BNet.
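To make the idea of an anchor-like proposal bag concrete, the following is a minimal sketch rather than the P2BNet implementation. It assumes that a bag is formed by centring boxes of several scales and aspect ratios on the annotated point. The function names `point_to_proposal_bag` and `iou`, the scale and ratio values, and the example ground-truth box are all hypothetical choices made for illustration.

```python
from itertools import product


def point_to_proposal_bag(cx, cy, scales=(32.0, 64.0, 128.0), ratios=(0.5, 1.0, 2.0)):
    """Build one bag of boxes (x1, y1, x2, y2) centred on a point annotation.

    Assumption: each box has area scale**2 and width/height ratio `ratio`,
    as in a standard anchor generator. The values are illustrative only.
    """
    bag = []
    for scale, ratio in product(scales, ratios):
        w = scale * ratio ** 0.5
        h = scale / ratio ** 0.5
        bag.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return bag


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical ground-truth box, used here only to measure bag quality.
    # Under point supervision this box is not available during training.
    gt = (70.0, 80.0, 130.0, 120.0)
    bag = point_to_proposal_bag(100.0, 100.0)
    best = max(bag, key=lambda box: iou(box, gt))
    print(len(bag), best, round(iou(best, gt), 3))
```

Because every box in the bag is tied to a single annotated point, each object gets the same number of proposals regardless of its size, which is one plausible reading of "inter-object balanced". A coarse-to-fine stage could then regenerate a new bag around the highest-scoring box with smaller offsets, but how boxes are scored and refined is not specified in the abstract and is left out of this sketch.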