Rethinking Generalization in Few-Shot Classification

Single image-level annotations correctly describe only a subset, often a small one, of an image's content, particularly when complex real-world scenes are depicted. While this might be acceptable in many classification scenarios, it poses a significant challenge for applications where the set of classes differs significantly between training and test time. In this paper, we take a closer look at the implications in the context of $\textit{few-shot learning}$. Splitting the input samples into patches and encoding these with the help of Vision Transformers allows us to establish semantic correspondences between local regions across images, independently of their respective class. The most informative patch embeddings for the task at hand are then determined as a function of the support set via online optimization at inference time, additionally providing visual interpretability of "$\textit{what matters most}$" in the image. We build on recent advances in unsupervised training of networks via masked image modelling to overcome the lack of fine-grained labels and learn the more general statistical structure of the data while avoiding negative image-level annotation influence, $\textit{aka}$ supervision collapse. Experimental results show the competitiveness of our approach, achieving new state-of-the-art results on four popular few-shot classification benchmarks for $5$-shot and $1$-shot scenarios.
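The patch-level correspondence idea in the abstract can be illustrated with a minimal sketch. It splits two images into non-overlapping patches and compares every query patch with every support patch by cosine similarity. Everything here is an assumption made for illustration: the function names are hypothetical, raw pixel values stand in for the Vision Transformer embeddings, and the support-set-dependent reweighting via online optimization is not shown. It is not the authors' implementation.

```python
import math


def split_into_patches(image, patch):
    """Split a 2-D grid (list of rows) into non-overlapping patch x patch tiles.

    Tiles are returned in row-major order, each flattened to a list.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            patches.append([image[r][c]
                            for r in range(top, top + patch)
                            for c in range(left, left + patch)])
    return patches


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def patch_similarity(support_img, query_img, patch):
    """Return a matrix sim[i][j]: similarity of query patch i to support patch j.

    Assumption: flattened pixel patches replace learned patch embeddings.
    """
    support = split_into_patches(support_img, patch)
    query = split_into_patches(query_img, patch)
    return [[cosine(q, s) for s in support] for q in query]
```

In a real pipeline the rows of such a matrix would be computed from learned embeddings, and a per-task weighting over support patches would decide which correspondences count towards the class score.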

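Few-shot learning

Few-shot learning (FSL) addresses how to improve the performance of neural networks when only a small amount of data is available. A class of methods frequently used in FSL is meta-learning. Like ordinary neural network training, meta-learning has a training phase and a testing phase, which are called meta-training and meta-testing respectively.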
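The description above does not say how meta-training and meta-testing tasks are built. A common convention, assumed here, is to sample N-way K-shot episodes: pick N classes, then K labelled support examples and a few query examples per class. The sketch below shows that convention; `sample_episode` and its parameters are hypothetical names, not part of any library.

```python
import random


def sample_episode(data_by_class, n_way, k_shot, n_query, rng=None):
    """Sample one N-way K-shot episode.

    data_by_class maps a class name to a list of examples.
    Returns (support, query), each a list of (example, episode_label) pairs,
    where episode labels run from 0 to n_way - 1.
    """
    rng = rng or random.Random()
    # Choose the classes for this episode; sorting makes seeded runs repeatable.
    classes = rng.sample(sorted(data_by_class), n_way)
    support, query = [], []
    for label, cls in enumerate(classes):
        # Draw support and query examples without overlap.
        items = rng.sample(data_by_class[cls], k_shot + n_query)
        support += [(x, label) for x in items[:k_shot]]
        query += [(x, label) for x in items[k_shot:]]
    return support, query
```

Meta-training would draw such episodes from the training classes, and meta-testing from a disjoint set of held-out classes, for example with `k_shot=1` or `k_shot=5`.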
