Few-shot object detection (FSOD) aims to learn a detector that can quickly adapt to previously unseen objects from scarce annotated examples, which is both challenging and in demand. Existing methods address this problem by performing the classification and localization subtasks with a shared component (e.g., the RoI head) of the detector, yet few of them consider that the two subtasks prefer different feature embeddings. In this paper, we carefully analyze the characteristics of FSOD and argue that a general few-shot detector should explicitly decompose the two subtasks while leveraging information from both to enhance feature representations. To this end, we propose a simple yet effective Adaptive Fully-Dual Network (AFD-Net). Specifically, we extend Faster R-CNN by introducing a Dual Query Encoder and a Dual Attention Generator for separate feature extraction, and a Dual Aggregator for separate model reweighting. Separate state estimation then follows naturally from the R-CNN detector. In addition, to obtain enhanced feature representations, we introduce an Adaptive Fusion Mechanism that adaptively performs feature fusion in the different subtasks. Extensive experiments on PASCAL VOC and MS COCO in various settings show that our method achieves new state-of-the-art performance by a large margin, demonstrating its effectiveness and generalization ability.
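The idea of decomposing the two subtasks while still sharing information between them can be illustrated with a toy sketch. The abstract does not specify the actual operations, so everything here is an assumption made for illustration: the function names (`reweight`, `adaptive_fuse`, `dual_branch`), the use of channel-wise multiplication for reweighting, and the scalar sigmoid gate standing in for a learned fusion weight are all hypothetical and are not taken from AFD-Net itself.

```python
from math import exp


def reweight(roi_feat, attention):
    """Channel-wise reweighting of a query RoI feature by a class attention vector.

    Assumption: reweighting is an element-wise product, a common choice in
    reweighting-based few-shot detectors; the abstract does not confirm it.
    """
    return [f * a for f, a in zip(roi_feat, attention)]


def sigmoid(x):
    return 1.0 / (1.0 + exp(-x))


def adaptive_fuse(primary, auxiliary, gate_logit):
    """Blend a branch's own feature with the other branch's feature.

    gate_logit is a hypothetical stand-in for a learned parameter; a large
    value keeps mostly the branch's own (primary) feature.
    """
    g = sigmoid(gate_logit)
    return [g * p + (1.0 - g) * q for p, q in zip(primary, auxiliary)]


def dual_branch(cls_query, loc_query, cls_attn, loc_attn,
                cls_gate=2.0, loc_gate=2.0):
    """Process classification and localization features in separate branches.

    Each branch has its own query feature, attention vector, and fusion gate,
    so the two subtasks are decomposed but can still exchange information.
    """
    cls_feat = reweight(cls_query, cls_attn)   # classification branch
    loc_feat = reweight(loc_query, loc_attn)   # localization branch
    cls_out = adaptive_fuse(cls_feat, loc_feat, cls_gate)
    loc_out = adaptive_fuse(loc_feat, cls_feat, loc_gate)
    return cls_out, loc_out


if __name__ == "__main__":
    cls_out, loc_out = dual_branch(
        cls_query=[1.0, 2.0], loc_query=[3.0, 4.0],
        cls_attn=[1.0, 0.5], loc_attn=[0.5, 1.0],
    )
    print(cls_out, loc_out)
```

In this sketch each subtask owns a separate feature, attention vector, and gate, which mirrors the "fully dual" decomposition only at a conceptual level; a real detector would operate on convolutional feature maps with learned encoders rather than on plain lists.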