Dynamic Sparse R-CNN

Sparse R-CNN is a recent strong object detection baseline that performs set prediction on sparse, learnable proposal boxes and proposal features. In this work, we propose to improve Sparse R-CNN with two dynamic designs. First, Sparse R-CNN adopts a one-to-one label assignment scheme, where the Hungarian algorithm is applied to match only one positive sample to each ground truth. Such one-to-one assignment may not be optimal for matching the learned proposal boxes with the ground truths. To address this problem, we propose dynamic label assignment (DLA), based on the optimal transport algorithm, to assign an increasing number of positive samples across the iterative training stages of Sparse R-CNN. We constrain the matching to become gradually looser over the sequential stages, since each later stage produces refined proposals with improved precision. Second, the learned proposal boxes and features remain fixed across different images during inference in Sparse R-CNN. Motivated by dynamic convolution, we propose dynamic proposal generation (DPG), which dynamically assembles multiple proposal experts to provide better initial proposal boxes and features for the subsequent training stages. DPG can thereby derive sample-dependent proposal boxes and features for inference. Experiments demonstrate that our method, named Dynamic Sparse R-CNN, improves the strong Sparse R-CNN baseline with different backbones for object detection. In particular, Dynamic Sparse R-CNN reaches a state-of-the-art 47.2% AP on the COCO 2017 validation set, surpassing Sparse R-CNN by 2.2% AP with the same ResNet-50 backbone.
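The two ideas in the abstract can be illustrated with a small, heavily simplified sketch. It is not the authors' implementation. The function `topk_assign` stands in for DLA by giving each ground truth its `k` cheapest proposals, with `k` growing per stage. This is a plain top-k rule, not the optimal transport solver the abstract refers to. The function `mix_experts` stands in for DPG by blending several sets of proposal boxes with softmax weights. Here the per-image expert logits are assumed to be given, whereas the abstract does not say how they are computed. All names, the cost matrix, and the per-stage `k` values are hypothetical.

```python
import math


def topk_assign(cost, k):
    """Assign each ground truth its k cheapest proposals (simplified stand-in).

    cost[g][p] is an assumed matching cost between ground truth g and
    proposal p. Returns a dict mapping proposal index -> ground truth index.
    With k = 1 and no conflicts this gives one positive per ground truth.
    """
    assigned = {}
    for g, row in enumerate(cost):
        candidates = sorted(range(len(row)), key=lambda p: row[p])[:k]
        for p in candidates:
            # A proposal claimed by two ground truths goes to the cheaper one.
            if p not in assigned or cost[g][p] < cost[assigned[p]][p]:
                assigned[p] = g
    return assigned


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mix_experts(experts, logits):
    """Blend expert proposal boxes with image-dependent weights.

    experts[e][n] is the n-th box (4 floats) of expert e. logits holds one
    score per expert, assumed to come from some image-conditioned module.
    """
    weights = softmax(logits)
    num_boxes = len(experts[0])
    return [
        [sum(w * experts[e][n][c] for e, w in enumerate(weights))
         for c in range(4)]
        for n in range(num_boxes)
    ]


if __name__ == "__main__":
    # Hypothetical costs: 2 ground truths, 4 proposals.
    cost = [[0.1, 0.4, 0.9, 0.8],
            [0.7, 0.5, 0.2, 0.3]]
    # Looser matching in later stages: more positives per ground truth.
    for stage, k in enumerate([1, 2, 3], start=1):
        print("stage", stage, "positives:", topk_assign(cost, k))

    # Two hypothetical experts, one box each, blended for one image.
    experts = [[[0.0, 0.0, 1.0, 1.0]], [[0.0, 0.0, 0.5, 0.5]]]
    print(mix_experts(experts, logits=[0.0, 0.0]))
```

Because the expert weights depend on the logits, different images yield different initial boxes, which is the sample-dependent behaviour the abstract attributes to DPG.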


Related content

R-CNN


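R-CNN is short for Region-CNN, and it is arguably the first algorithm to successfully apply deep learning to object detection. Traditional object detection methods are mostly built on image recognition. Typically, an exhaustive search selects every region box in the image where an object might appear, features are extracted from these region boxes and classified with an image recognition method, and once all successfully classified regions are obtained, non-maximum suppression (NMS) produces the final output.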
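The final step of that pipeline, non-maximum suppression, can be sketched as follows. This is a minimal greedy version in plain Python, assuming boxes are given as `(x1, y1, x2, y2)` tuples and that the IoU threshold of 0.5 is only an illustrative default. The function names are hypothetical and not taken from any particular library.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))
```

In this example the second box overlaps the first with an IoU of about 0.68, so it is suppressed, while the third box does not overlap either and is kept.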

[CVPR 2022] Sparse Instance Activation for Real-Time Instance Segmentation

Zhuanzhi Member Service

8+ reads · March 12, 2022

[CVPR 2022] The Devil Is in the Details: Window-based Attention for Image Compression

Zhuanzhi Member Service

8+ reads · March 12, 2022

Six recent must-read object detection papers from ECCV 2020, a top computer vision conference

Zhuanzhi Member Service

59+ reads · July 7, 2020

[Xiamen University, CVPR 2020] Adaptive object detectors that harmonize transferability and discriminability, Adapting Object Detectors

Zhuanzhi Member Service

26+ reads · March 16, 2020