Recently, two-stage Deformable DETR introduced the query-based two-stage head, a new type of two-stage head that differs from the region-based two-stage heads of classical detectors such as Faster R-CNN. In query-based two-stage heads, the second stage selects one feature per detection, called the query, as opposed to pooling a rectangular grid of features as in region-based detectors. In this work, we further improve the query-based head from Deformable DETR, significantly speeding up its convergence while increasing its performance. This is achieved by incorporating classical techniques such as anchor generation within the query-based paradigm. By combining the best of both the classical and the query-based worlds, our FQDet head peaks at 45.4 AP on the 2017 COCO validation set when using a ResNet-50+TPN backbone, after training for only 12 epochs using the 1x schedule. We outperform other high-performing two-stage heads such as Cascade R-CNN, while using the same backbone and often being computationally cheaper. Additionally, when using the large ResNeXt-101-DCN+TPN backbone and multi-scale testing, our FQDet head achieves 52.9 AP on the 2017 COCO test-dev set after only 12 epochs of training. Code will be released.
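To make the distinction between the two kinds of second-stage input concrete, the following is a minimal NumPy sketch, not the FQDet or Deformable DETR implementation. It assumes a single feature map with one score per location; the function names `select_queries` and `pool_region`, the map size, the number of queries, and the 7x7 grid are all hypothetical choices made for illustration. The region pooling uses nearest-neighbour sampling at bin centers as a simplification.

```python
import numpy as np


def select_queries(feat_map, scores, k):
    """Query-based selection: keep ONE feature vector per detection.

    feat_map: (H, W, C) features, scores: (H, W) per-location scores.
    Returns the k highest-scoring feature vectors, shape (k, C),
    together with their flat indices into the H*W locations.
    """
    h, w, c = feat_map.shape
    flat_feats = feat_map.reshape(h * w, c)
    # Flat indices of the k highest-scoring locations, best first.
    top_idx = np.argsort(scores.reshape(-1))[::-1][:k]
    return flat_feats[top_idx], top_idx


def pool_region(feat_map, box, grid=7):
    """Region-based pooling: keep a grid x grid block of features per detection.

    box = (x0, y0, x1, y1) in feature-map coordinates. Each bin is
    represented by the feature nearest to its center (a simplification).
    Returns an array of shape (grid, grid, C).
    """
    h, w, _ = feat_map.shape
    x0, y0, x1, y1 = box
    # Bin centers along each axis.
    xs = x0 + (np.arange(grid) + 0.5) * (x1 - x0) / grid
    ys = y0 + (np.arange(grid) + 0.5) * (y1 - y0) / grid
    xi = np.clip(np.floor(xs).astype(int), 0, w - 1)
    yi = np.clip(np.floor(ys).astype(int), 0, h - 1)
    # Broadcasting the row and column indices yields the full grid.
    return feat_map[yi[:, None], xi[None, :]]


# Illustrative sizes only (assumptions, not values from the paper).
rng = np.random.default_rng(0)
feat_map = rng.normal(size=(32, 32, 256))
scores = rng.random((32, 32))

queries, query_idx = select_queries(feat_map, scores, k=100)
region = pool_region(feat_map, box=(4.0, 6.0, 18.0, 20.0))

print(queries.shape)  # one C-dimensional vector per detection
print(region.shape)   # a grid of C-dimensional vectors for one detection
```

In this sketch each query-based detection carries a single 256-dimensional vector, whereas each region-based detection carries 49 of them, which illustrates why the per-detection input of a query-based second stage is smaller.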