Few-shot Learning (FSL) aims to perform inference on novel categories containing only a few labeled examples, with the help of knowledge learned from base categories containing abundant labeled training samples. While numerous works have addressed the FSL task, Vision Transformers (ViTs) have rarely been adopted as the backbone for FSL, with the few existing attempts focusing on naively finetuning the whole backbone or the classification layer. Essentially, although ViTs have been shown to achieve comparable or even better performance on other vision tasks, it remains nontrivial to efficiently finetune ViTs in real-world FSL scenarios. To this end, we propose a novel efficient Transformer Tuning (eTT) method that facilitates finetuning ViTs for FSL tasks. The key novelties are the newly presented Attentive Prefix Tuning (APT) and Domain Residual Adapter (DRA) for task tuning and backbone tuning, respectively. Specifically, in APT, the prefix is projected to new key and value pairs that are attached to each self-attention layer to provide the model with task-specific information. Moreover, we design the DRA in the form of learnable offset vectors to handle the potential domain gaps between base and novel data. To ensure that APT does not deviate much from the initial task-specific information, we further propose a novel prototypical regularization, which maximizes the similarity between the projected distribution of the prefix and the initial prototypes, regularizing the update procedure. Our method achieves outstanding performance on the challenging Meta-Dataset, and we conduct extensive experiments to demonstrate its efficacy.
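The core of APT is prepending learned prefix key/value pairs to each self-attention layer's own keys and values, so every query can also attend to the task-specific prefix. A minimal dependency-free sketch of this idea is below; the function and variable names are illustrative assumptions, not the paper's actual implementation:

```python
import math

def matmul(A, B):
    """Plain-Python matrix multiply: A (n x k) times B (k x m)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention_with_prefix(Q, K, V, prefix_K, prefix_V):
    """Single-head attention where learnable prefix key/value pairs
    (prefix_K, prefix_V) are prepended to the sequence's own keys and
    values, as in prefix tuning. Q, K, V are lists of d-dim rows."""
    K_all = prefix_K + K            # (p + n) keys
    V_all = prefix_V + V            # (p + n) values
    d = len(Q[0])
    # Scores: each query against all (prefix + sequence) keys.
    scores = matmul(Q, [list(col) for col in zip(*K_all)])  # n x (p + n)
    out = []
    for row in scores:
        w = softmax([s / math.sqrt(d) for s in row])
        out.append([sum(wi * v[j] for wi, v in zip(w, V_all))
                    for j in range(len(V_all[0]))])
    return out
```

In eTT the prefix is additionally produced by projecting task information into these key/value pairs at every layer; the sketch only shows how such a prefix enters the attention computation.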