One-to-one matching is a crucial design in DETR-like object detection frameworks: it enables DETR to perform end-to-end detection. However, it also suffers from scarce positive-sample supervision and slow convergence. Several recent works have proposed one-to-many matching mechanisms to accelerate training and boost detection performance. We revisit these methods and model them in a unified format as augmentations of the object queries. In this paper, we propose two methods that realize one-to-many matching from a different perspective, namely by augmenting images or image features. The first method is One-to-many Matching via Data Augmentation (denoted as DataAug-DETR). It spatially transforms the images and includes multiple augmented versions of each image in the same training batch. This simple augmentation strategy already achieves one-to-many matching and, surprisingly, improves DETR's performance. The second method is One-to-many Matching via Feature Augmentation (denoted as FeatAug-DETR). Unlike DataAug-DETR, it augments the image features instead of the original images and includes multiple augmented features in the same batch to realize one-to-many matching. FeatAug-DETR significantly accelerates DETR training and boosts detection performance while keeping the inference speed unchanged. We conduct extensive experiments to evaluate the effectiveness of the proposed approach on DETR variants, including DAB-DETR, Deformable-DETR, and H-Deformable-DETR. Without extra training data, FeatAug-DETR shortens the training convergence period of Deformable-DETR to 24 epochs and achieves 58.3 AP on the COCO val2017 set with Swin-L as the backbone.
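The batch construction described for DataAug-DETR can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`hflip`, `random_crop`, `build_augmented_batch`), the choice of horizontal flip plus random crop as the spatial transforms, and the `num_views` and `min_scale` parameters are all assumptions made for illustration. The point it shows is that each ground-truth object appears once per augmented view in the same batch, so one-to-one matching inside each view gives several positive queries per object overall.

```python
import random
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
Image = List[List[int]]                  # H x W grid of pixel values


def hflip(image: Image, boxes: List[Box]) -> Tuple[Image, List[Box]]:
    """Mirror the image left-right and mirror the box x-coordinates with it."""
    width = len(image[0])
    flipped_image = [row[::-1] for row in image]
    flipped_boxes = [(width - x2, y1, width - x1, y2) for x1, y1, x2, y2 in boxes]
    return flipped_image, flipped_boxes


def random_crop(image: Image, boxes: List[Box], rng: random.Random,
                min_scale: float = 0.6) -> Tuple[Image, List[Box]]:
    """Crop a random window; clip boxes to it and drop boxes left empty."""
    height, width = len(image), len(image[0])
    crop_h = max(1, int(height * rng.uniform(min_scale, 1.0)))
    crop_w = max(1, int(width * rng.uniform(min_scale, 1.0)))
    top = rng.randint(0, height - crop_h)
    left = rng.randint(0, width - crop_w)
    cropped = [row[left:left + crop_w] for row in image[top:top + crop_h]]
    kept = []
    for x1, y1, x2, y2 in boxes:
        nx1 = min(max(x1 - left, 0), crop_w)
        nx2 = min(max(x2 - left, 0), crop_w)
        ny1 = min(max(y1 - top, 0), crop_h)
        ny2 = min(max(y2 - top, 0), crop_h)
        if nx2 > nx1 and ny2 > ny1:
            kept.append((nx1, ny1, nx2, ny2))
    return cropped, kept


def build_augmented_batch(samples: List[Tuple[Image, List[Box]]],
                          num_views: int, seed: int = 0) -> List[Dict]:
    """Place `num_views` independently transformed copies of every image
    in one batch (hypothetical helper, assumed transforms)."""
    rng = random.Random(seed)
    batch = []
    for image_id, (image, boxes) in enumerate(samples):
        for view in range(num_views):
            aug_image, aug_boxes = image, list(boxes)
            if rng.random() < 0.5:
                aug_image, aug_boxes = hflip(aug_image, aug_boxes)
            aug_image, aug_boxes = random_crop(aug_image, aug_boxes, rng)
            batch.append({"image_id": image_id, "view": view,
                          "image": aug_image, "boxes": aug_boxes})
    return batch
```

In a real training loop each batch entry would then go through the detector and its usual one-to-one Hungarian matching; the sketch covers only how the views are assembled. Note that a crop can remove an object entirely, in which case that view contributes no positive for it.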