One-to-one set matching is a key design for DETR to establish its end-to-end capability, so that object detection does not require hand-crafted NMS (non-maximum suppression) to remove duplicate detections. This end-to-end signature is important for the versatility of DETR, and it has been generalized to broader vision tasks. However, we note that few queries are assigned as positive samples, so one-to-one set matching significantly reduces the training efficacy of positive samples. We propose a simple yet effective method based on a hybrid matching scheme that combines the original one-to-one matching branch with an auxiliary one-to-many matching branch during training. Our hybrid strategy is shown to significantly improve accuracy. At inference, only the original one-to-one matching branch is used, thus maintaining the end-to-end merit and the same inference efficiency of DETR. We name the method H-DETR and show that it consistently improves a wide range of representative DETR methods, including DeformableDETR, PETRv2, PETR, and TransTrack, among others, across a wide range of visual tasks. The code is available at: https://github.com/HDETR
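The hybrid matching idea can be illustrated with a minimal sketch. This is not the released H-DETR code: the function names, the `aux_weight` parameter, the brute-force matcher (a stand-in for Hungarian matching, practical only for tiny inputs), and the use of the matching cost as a proxy for the training loss are all assumptions made for illustration. One-to-many matching is modeled here by repeating each ground-truth object `k` times and then running ordinary one-to-one matching on the enlarged set, which is one plausible way to realize such a branch.

```python
from itertools import permutations


def one_to_one_match(cost):
    """Minimum-cost one-to-one assignment by brute force.

    cost[q][g] is the cost of matching query q to ground-truth object g.
    Assumes at least one query and num_queries >= num_gt.
    Returns a list of (query_index, gt_index) pairs.
    """
    num_queries, num_gt = len(cost), len(cost[0])
    best, best_total = (), float("inf")
    # Try every way of picking a distinct query for each ground-truth object.
    for queries in permutations(range(num_queries), num_gt):
        total = sum(cost[q][g] for g, q in enumerate(queries))
        if total < best_total:
            best, best_total = queries, total
    return [(q, g) for g, q in enumerate(best)]


def one_to_many_match(cost, k):
    """Assumed one-to-many scheme: repeat each ground truth k times,
    then run one-to-one matching on the enlarged cost matrix."""
    num_gt = len(cost[0])
    repeated = [row * k for row in cost]  # columns: gt0..gtN-1, repeated k times
    # Map each repeated column back to its original ground-truth index.
    return [(q, col % num_gt) for q, col in one_to_one_match(repeated)]


def hybrid_training_cost(cost_main, cost_aux, k, aux_weight=1.0):
    """Training-time objective sketch: one-to-one branch plus a weighted
    auxiliary one-to-many branch, each with its own set of queries."""
    main = one_to_one_match(cost_main)
    aux = one_to_many_match(cost_aux, k)
    loss_main = sum(cost_main[q][g] for q, g in main)
    loss_aux = sum(cost_aux[q][g] for q, g in aux)
    return loss_main + aux_weight * loss_aux, main, aux


if __name__ == "__main__":
    # Toy costs: 3 main queries and 4 auxiliary queries, 2 ground-truth objects.
    cost_main = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.6]]
    cost_aux = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.7, 0.3]]
    total, main, aux = hybrid_training_cost(cost_main, cost_aux, k=2)
    print("one-to-one positives:", sorted(main))
    print("one-to-many positives:", sorted(aux))
    print("combined cost:", total)
```

In this toy setup the one-to-one branch yields one positive query per object, while the auxiliary branch with `k=2` yields two, which is the sense in which the extra branch supplies more positive samples during training. At inference only the predictions of the one-to-one branch would be kept, so no duplicate removal step is needed.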