Visual few-shot classification aims to transfer the state-of-the-art performance of deep learning vision systems to tasks where only a very limited number of training samples are available. The dominant approach is to train a feature extractor on a large and diverse dataset and then apply it to the few-shot task at hand. Thanks to the priors encoded in the feature extractor, classification tasks with as few as one example (or "shot") per class can be solved with high accuracy, even when the shots display individual features that are not representative of their classes. The problem becomes harder, however, when some of the given shots contain multiple objects. In this paper, we present a strategy for detecting the presence of multiple, previously unseen objects in a given shot. The methodology is based on identifying the corners of a simplex in a high-dimensional space. We introduce an optimization routine and showcase its ability to detect multiple (previously unseen) objects in raw images. We then introduce a downstream classifier that exploits the presence of multiple objects to improve few-shot classification performance in the extreme setting where only one shot is given per class. Using standard benchmarks of the field, we show that the proposed method yields a slight, yet statistically significant, improvement in accuracy in these settings.
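To make the idea of identifying the corners of a simplex concrete, the following minimal sketch uses the successive projection algorithm, a standard technique from the matrix factorization literature. It is not the optimization routine introduced in this paper. It assumes, for illustration only, that the feature vectors of interest are convex combinations of a few linearly independent corner vectors and that the corners themselves appear among the inputs. The function name `find_simplex_corners` and all parameters are hypothetical.

```python
import numpy as np


def find_simplex_corners(points, n_corners):
    """Return indices of the rows of `points` selected as simplex corners.

    Sketch of the successive projection algorithm (an illustrative stand-in,
    not the paper's method). Assumes each row is a convex combination of
    `n_corners` linearly independent corner vectors, and that the corners
    themselves are present among the rows.
    """
    residual = np.asarray(points, dtype=float).copy()
    corner_indices = []
    for _ in range(n_corners):
        norms = np.linalg.norm(residual, axis=1)
        idx = int(np.argmax(norms))  # the largest norm is reached at a corner
        if norms[idx] == 0.0:
            break  # nothing left to explain: fewer corners than requested
        corner_indices.append(idx)
        direction = residual[idx] / norms[idx]
        # Remove the component along the selected corner from every point.
        residual -= np.outer(residual @ direction, direction)
    return corner_indices


# Synthetic example: 3 hypothetical corners in an 8-dimensional feature space.
rng = np.random.default_rng(0)
corners = rng.normal(size=(3, 8))
weights = rng.dirichlet(np.ones(3), size=50)      # random convex weights
features = np.vstack([corners, weights @ corners])  # rows 0..2 are the corners
print(sorted(find_simplex_corners(features, 3)))
```

On noisy real features the selected rows would only approximate the corners, and the number of corners would itself have to be estimated, which this sketch does not attempt.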