Despite promising results, existing oriented object detection methods usually rely on heuristically designed rules, such as RRoI generation and rotated NMS. In this paper, we propose an end-to-end framework for oriented object detection that simplifies the model pipeline and obtains superior performance. Our framework is based on DETR, with the box regression head replaced by a points prediction head. Learning points is more flexible, and the distribution of the points reflects the angle and size of the target rotated box. We further propose to decouple the query features into classification and regression features, which significantly improves model precision. Aerial images usually contain thousands of instances. To better balance precision and efficiency, we propose a novel dynamic query design, which reduces the number of object queries in stacked decoder layers without sacrificing model performance. Finally, we rethink the label assignment strategy of existing DETR-like detectors and propose an effective label re-assignment strategy for improved performance. We name our method D2Q-DETR. Experiments on the large-scale, challenging DOTA-v1.0 and DOTA-v1.5 datasets show that D2Q-DETR outperforms existing NMS-based and NMS-free oriented object detection methods and achieves a new state of the art.
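To make the claim that a point distribution can reflect the angle and size of a rotated box concrete, the following is a minimal sketch, not the conversion used by D2Q-DETR, which the abstract does not specify. It assumes a simple principal-axis fit: the dominant direction of the points gives the angle, and the extents along that direction and its normal give the width and height. The function name `points_to_rotated_box` and the example rectangle are hypothetical.

```python
import math


def points_to_rotated_box(points):
    """Fit an oriented box (cx, cy, w, h, angle) to a list of 2D points.

    Illustrative assumption only: the angle is taken from the principal
    axis of the point distribution, and the size from the extents of the
    points along that axis and its normal.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n

    # Entries of the 2x2 covariance matrix of the points.
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n

    # Closed-form orientation of the principal axis, in (-pi/2, pi/2].
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c, s = math.cos(angle), math.sin(angle)

    # Coordinates of each point along the principal axis (u) and its normal (v).
    us = [(x - mean_x) * c + (y - mean_y) * s for x, y in points]
    vs = [-(x - mean_x) * s + (y - mean_y) * c for x, y in points]

    width = max(us) - min(us)
    height = max(vs) - min(vs)

    # Box centre: midpoint of the extents, mapped back to image coordinates.
    u_mid = (max(us) + min(us)) / 2.0
    v_mid = (max(vs) + min(vs)) / 2.0
    cx = mean_x + u_mid * c - v_mid * s
    cy = mean_y + u_mid * s + v_mid * c
    return cx, cy, width, height, angle


# Hypothetical example: corners of a 4 x 2 rectangle centred at (10, 5)
# and rotated by 30 degrees.
theta = math.radians(30.0)
corners = [
    (10 + u * math.cos(theta) - v * math.sin(theta),
     5 + u * math.sin(theta) + v * math.cos(theta))
    for u, v in [(2, 1), (2, -1), (-2, 1), (-2, -1)]
]
box = points_to_rotated_box(corners)
```

Under this assumption the fit is ambiguous for point sets that are nearly isotropic (for example, points on a square), since the principal axis is then ill-defined; a learned points head would need some other way to handle such cases.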