Most existing methods for few-shot object detection follow the fine-tuning paradigm, which implicitly assumes that class-agnostic, generalizable knowledge can be learned from base classes with abundant samples and transferred to novel classes with limited samples through such a two-stage training strategy. However, this assumption does not necessarily hold, since an object detector can hardly distinguish class-agnostic knowledge from class-specific knowledge automatically without explicit modeling. In this work, we propose to explicitly learn three types of class-agnostic commonalities between base and novel classes: recognition-related semantic commonalities, localization-related semantic commonalities, and distribution commonalities. We design a unified distillation framework based on a memory bank that performs distillation of all three types of commonalities jointly and efficiently. Extensive experiments demonstrate that our method can be readily integrated into most existing fine-tuning based methods and consistently improves their performance by a large margin.
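To make the memory-bank-based distillation concrete, the following is a minimal sketch of one plausible instantiation in PyTorch: a bank of per-class prototypes updated by exponential moving average, with a cosine-similarity distillation loss that pulls RoI features toward their class prototype so that structure shared across classes can be transferred. The class name `CommonalityMemoryBank`, the EMA update rule, and the loss form are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F


class CommonalityMemoryBank:
    """Hypothetical memory bank of per-class prototypes, updated by
    exponential moving average (EMA), used to distill class-agnostic
    commonalities into RoI features. A sketch, not the paper's method."""

    def __init__(self, num_classes: int, feat_dim: int, momentum: float = 0.99):
        self.momentum = momentum
        # One prototype vector per class, initialized to zeros.
        self.prototypes = torch.zeros(num_classes, feat_dim)

    @torch.no_grad()
    def update(self, feats: torch.Tensor, labels: torch.Tensor) -> None:
        # EMA update: move each class prototype toward the mean of the
        # current batch's RoI features for that class.
        for c in labels.unique():
            mean_feat = feats[labels == c].mean(dim=0)
            self.prototypes[c] = (
                self.momentum * self.prototypes[c]
                + (1.0 - self.momentum) * mean_feat
            )

    def distill_loss(self, feats: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Cosine-similarity distillation: penalize the angular distance
        # between each RoI feature and its class prototype.
        protos = F.normalize(self.prototypes[labels], dim=1)
        feats = F.normalize(feats, dim=1)
        return (1.0 - (feats * protos).sum(dim=1)).mean()
```

In a fine-tuning pipeline, one would presumably call `update` on RoI features during base training and add `distill_loss` to the detection loss during novel-class fine-tuning; both hooks are hypothetical here.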