Object detection involves two sub-tasks: localizing objects in an image and classifying them into categories. In existing CNN-based detectors, we observe a widespread divergence between localization and classification, i.e., a detection's classification score often does not reflect how well its box is localized, which degrades performance. In this work, we propose a mutual learning framework to reconcile the two tasks. In particular, the two tasks are forced to learn from each other through a novel mutual labeling strategy. In addition, we introduce a simple yet effective IoU rescoring scheme, which further reduces the divergence. We also define a metric based on the Spearman rank correlation to quantify the divergence, and show that it correlates well with detection performance. The proposed approach is general-purpose and can be easily integrated into existing detectors such as FCOS and RetinaNet. We achieve a significant performance gain over the baseline detectors on the COCO dataset.
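To make the divergence metric concrete, the following minimal sketch computes a Spearman rank correlation between per-detection classification scores and localization quality (IoU with the matched ground-truth box). The abstract does not give the exact definition of the metric or of the rescoring rule, so everything here is an assumption for illustration: the function names `average_ranks`, `spearman`, and `rescore` are hypothetical, the toy numbers are made up, and multiplying the score by a predicted IoU is one common form of rescoring rather than necessarily the scheme used in this work.

```python
from __future__ import annotations

from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


def rescore(cls_scores: Sequence[float], pred_ious: Sequence[float]) -> list[float]:
    """Hypothetical rescoring: scale each class score by a predicted IoU."""
    return [s * q for s, q in zip(cls_scores, pred_ious)]


# Toy detections (made-up values): classification score and IoU with ground truth.
cls_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
ious = [0.5, 0.9, 0.6, 0.8, 0.7]

rho_before = spearman(cls_scores, ious)
# Idealized assumption: the IoU predictor is perfect, so pred_ious == ious.
rho_after = spearman(rescore(cls_scores, ious), ious)

print(f"rank correlation before rescoring: {rho_before:.2f}")
print(f"rank correlation after rescoring:  {rho_after:.2f}")
```

A correlation near 1 means the ranking by score agrees with the ranking by localization quality; a low or negative value indicates the divergence described above. In this toy example the correlation rises after rescoring, but only because a perfect IoU predictor is assumed; with a learned predictor the improvement would depend on how accurate its estimates are.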