In this paper, we observe that one-to-one set matching in DETR assigns too few queries as positive samples, which leads to sparse supervision on the encoder's output. This considerably hurts the discriminative feature learning of the encoder and, vice versa, attention learning in the decoder. To alleviate this, we present a novel collaborative hybrid assignments training scheme, namely Co-DETR, to learn more efficient and effective DETR-based detectors from versatile label assignment manners. This new training scheme can easily enhance the encoder's learning ability in end-to-end detectors by training multiple parallel auxiliary heads supervised by one-to-many label assignments, such as those of ATSS, FCOS, and Faster R-CNN. In addition, we construct extra customized positive queries by extracting the positive coordinates from these auxiliary heads to improve the training efficiency of positive samples in the decoder. During inference, these auxiliary heads are discarded, so our method introduces no additional parameters or computational cost to the original detector while requiring no hand-crafted non-maximum suppression (NMS). We conduct extensive experiments to evaluate the effectiveness of the proposed approach on DETR variants, including DAB-DETR, Deformable-DETR, and DINO-Deformable-DETR. Specifically, we improve the basic Deformable-DETR by 5.8% AP in 12-epoch training and 3.2% AP in 36-epoch training. The state-of-the-art DINO-Deformable-DETR can still be improved from 49.4% to 51.2% AP on MS COCO val. Surprisingly, when combined with the large-scale MixMIM-g backbone with 1 billion parameters, Co-DETR achieves 64.5% mAP on MS COCO test-dev, reaching superior performance with far less extra data. Code will be available at https://github.com/Sense-X/Co-DETR.
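The sparsity of positive samples under one-to-one matching can be illustrated with a toy example. The sketch below is not the Co-DETR implementation: it assumes a matching cost of 1 - IoU only (DETR's actual cost also includes classification and box-regression terms), uses brute-force search in place of the Hungarian algorithm, and stands in for the auxiliary heads' one-to-many rules with a plain IoU threshold rather than the ATSS, FCOS, or Faster R-CNN assigners. The function names, boxes, and the 0.5 threshold are all hypothetical.

```python
from itertools import permutations

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def one_to_one(preds, gts):
    """Brute-force bipartite matching: each ground truth gets exactly one prediction.

    Assumption: the cost is 1 - IoU only, and the search is exhaustive,
    which is only feasible for tiny inputs like this one.
    """
    best, best_cost = None, float("inf")
    for combo in permutations(range(len(preds)), len(gts)):
        cost = sum(1.0 - iou(preds[p], gts[g]) for g, p in enumerate(combo))
        if cost < best_cost:
            best, best_cost = combo, cost
    return set(best)

def one_to_many(preds, gts, thr=0.5):
    """Every prediction whose IoU with some ground truth reaches thr is positive.

    Assumption: a simple IoU threshold, not a real ATSS/FCOS/Faster R-CNN rule.
    """
    return {p for p, box in enumerate(preds) if any(iou(box, g) >= thr for g in gts)}

# Two ground-truth boxes and six candidate predictions (hypothetical values).
gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [
    (0, 0, 10, 10),    # exact match for the first ground truth
    (1, 1, 11, 11),    # close to the first ground truth
    (0, 0, 10, 8),     # close to the first ground truth
    (20, 20, 30, 30),  # exact match for the second ground truth
    (21, 20, 31, 30),  # close to the second ground truth
    (50, 50, 60, 60),  # background
]

print("one-to-one positives: ", sorted(one_to_one(preds, gts)))
print("one-to-many positives:", sorted(one_to_many(preds, gts)))
```

Under these assumptions, one-to-one matching marks only as many predictions positive as there are ground-truth boxes, while the threshold-based assignment also marks the nearby candidates positive, which is the denser supervision signal the auxiliary heads are meant to provide.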