This work aims to solve the challenging few-shot object detection problem, where only a few annotated examples are available for each object category to train a detection model. The ability to learn to detect an object from just a few examples is common in human vision systems but remains absent in computer vision systems. Though few-shot meta-learning offers a promising solution, previous works mostly target image classification and are not directly applicable to the much more complicated object detection task. In this work, we propose a novel meta-learning-based model with a carefully designed architecture, consisting of a meta-model and a base detection model. The base detection model is trained on several base classes with sufficient samples to provide basis features. The meta-model is trained to reweight the importance of the base detection model's features over the input image and to adapt these features to assist the detection of novel objects from a few examples. The meta-model is lightweight, end-to-end trainable, and able to quickly endow the base model with the ability to detect novel objects. Experiments demonstrate that our model outperforms baselines by a large margin on few-shot object detection, across multiple datasets and settings. Our model also adapts quickly to novel few-shot classes.
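To make the reweighting idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the meta-model maps each support example to one weight per feature channel, that the weights of a class are averaged over its few examples, and that reweighting is a channel-wise multiplication; the abstract does not specify any of these details. All names (`class_reweighting_vector`, `reweight_features`, `class_specific_features`) are hypothetical.

```python
from typing import Dict, List

# A feature map from the base detector, indexed as [channel][row][column].
FeatureMap = List[List[List[float]]]


def class_reweighting_vector(support_embeddings: List[List[float]]) -> List[float]:
    """Average the per-example channel weights of one class.

    Assumption: a (hypothetical) meta-model has already embedded each
    support example into a vector with one entry per feature channel.
    """
    num_examples = len(support_embeddings)
    num_channels = len(support_embeddings[0])
    return [
        sum(example[c] for example in support_embeddings) / num_examples
        for c in range(num_channels)
    ]


def reweight_features(features: FeatureMap, weights: List[float]) -> FeatureMap:
    """Scale each channel of the feature map by the matching class weight."""
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature channel")
    return [
        [[value * weight for value in row] for row in channel]
        for channel, weight in zip(features, weights)
    ]


def class_specific_features(
    features: FeatureMap, support_by_class: Dict[str, List[List[float]]]
) -> Dict[str, FeatureMap]:
    """Produce one reweighted copy of the feature map per novel class."""
    return {
        name: reweight_features(features, class_reweighting_vector(embeddings))
        for name, embeddings in support_by_class.items()
    }


# Toy input: 2 channels, each a 1x2 spatial map.
base_features = [[[1.0, 2.0]], [[3.0, 4.0]]]
# Two support embeddings for a single novel class (values are made up).
support = {"bird": [[1.0, 0.0], [0.0, 0.0]]}
bird_features = class_specific_features(base_features, support)["bird"]
```

In this sketch a detection head would then run once per class on its reweighted feature map. Adding a novel class only requires computing a new weight vector from its few examples, which is one plausible reading of why the abstract describes the adaptation as fast.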