We study the problem of weakly semi-supervised object detection with points (WSSOD-P), where the training data consists of a small set of images fully annotated with bounding boxes and a large set of weakly labeled images with only a single point annotated for each instance. The core of this task is to train a point-to-box regressor on the well-labeled images, which can then be used to predict credible bounding boxes for each point annotation. We challenge the prior belief that existing CNN-based detectors are not compatible with this task. Based on the classic R-CNN architecture, we propose an effective point-to-box regressor: Group R-CNN. Group R-CNN first uses instance-level proposal grouping to generate a group of proposals for each point annotation, and can thus obtain a high recall rate. To better distinguish different instances and improve precision, we propose instance-level proposal assignment to replace the vanilla assignment strategy adopted in the original R-CNN methods. As naive instance-level assignment makes convergence difficult, we propose instance-aware representation learning, which consists of instance-aware feature enhancement and instance-aware parameter generation, to overcome this issue. Comprehensive experiments on the MS-COCO benchmark demonstrate the effectiveness of our method. Specifically, Group R-CNN outperforms the prior method Point DETR by 3.9 mAP with 5% well-labeled images, which is the most challenging scenario. The source code can be found at https://github.com/jshilong/GroupRCNN
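To make the idea of one proposal group per point annotation concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say how groups are formed, so the rule used here (a proposal joins the group of every annotated point it contains) is an assumption, and the names `contains` and `group_proposals_by_point` are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Point = Tuple[float, float]              # (x, y)


def contains(box: Box, point: Point) -> bool:
    """Return True if the point lies inside the box (borders included)."""
    x1, y1, x2, y2 = box
    px, py = point
    return x1 <= px <= x2 and y1 <= py <= y2


def group_proposals_by_point(
    proposals: List[Box], points: List[Point]
) -> Dict[int, List[Box]]:
    """Assumed grouping rule: a proposal joins the group of every point it contains."""
    groups: Dict[int, List[Box]] = {i: [] for i in range(len(points))}
    for box in proposals:
        for i, point in enumerate(points):
            if contains(box, point):
                groups[i].append(box)
    return groups


# Toy data: two point annotations and four candidate boxes.
points = [(10.0, 10.0), (50.0, 50.0)]
proposals = [(0, 0, 20, 20), (5, 5, 60, 60), (40, 40, 70, 70), (80, 80, 90, 90)]
groups = group_proposals_by_point(proposals, points)
```

In this toy example the box `(5, 5, 60, 60)` contains both points and so lands in both groups. Ambiguity of this kind between instances is the sort of case that instance-level proposal assignment is meant to resolve.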