Most recent few-shot learning (FSL) algorithms are based on transfer learning, where a model is pre-trained on a large amount of source data and then fine-tuned on a small amount of target data. In transfer learning-based FSL, sophisticated pre-training methods have been widely studied to obtain universal representations. It has therefore become increasingly important to exploit these universal representations in downstream tasks, yet few studies address fine-tuning in FSL. In this paper, we focus on how to transfer pre-trained models to few-shot downstream tasks from three perspectives: the update method, data augmentation, and test-time augmentation. First, we compare two popular update methods, full fine-tuning (i.e., updating the entire network, FT) and linear probing (i.e., updating only a linear classifier, LP). We find that LP is better than FT with extremely few samples, whereas FT outperforms LP as the number of training samples increases. Next, we show that data augmentation does not guarantee few-shot performance improvement and investigate its effectiveness as a function of augmentation intensity. Finally, considering support-query distribution shifts, we apply augmentation to both the support set used for the update (i.e., data augmentation) and the query set used for prediction (i.e., test-time augmentation), and improve few-shot performance. The code is available at https://github.com/kimyuji/updating_FSL.
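To make the compared strategies concrete, below is a minimal PyTorch sketch of the two update methods (FT vs. LP) and of test-time augmentation by prediction averaging. This is an illustrative sketch, not the authors' implementation (see the linked repository for that); `backbone`, `feat_dim`, `n_way`, `augment`, and `n_views` are hypothetical placeholders.

```python
import torch
import torch.nn as nn


def build_few_shot_model(backbone: nn.Module, feat_dim: int, n_way: int,
                         mode: str = "LP") -> nn.Module:
    """Attach an n-way linear head to a pre-trained backbone (placeholder names).

    mode="LP": linear probing, i.e., freeze the backbone and train only the head.
    mode="FT": full fine-tuning, i.e., update every parameter of the network.
    """
    classifier = nn.Linear(feat_dim, n_way)
    if mode == "LP":
        for p in backbone.parameters():
            p.requires_grad = False  # only the linear classifier is updated
    elif mode == "FT":
        for p in backbone.parameters():
            p.requires_grad = True   # the entire network is updated
    else:
        raise ValueError(f"unknown mode: {mode}")
    return nn.Sequential(backbone, classifier)


@torch.no_grad()
def predict_with_tta(model: nn.Module, query_images: torch.Tensor,
                     augment, n_views: int = 4) -> torch.Tensor:
    """Test-time augmentation: average softmax outputs over augmented views
    of each query image, then take the argmax as the prediction."""
    model.eval()
    probs = torch.stack([
        torch.softmax(model(augment(query_images)), dim=-1)
        for _ in range(n_views)
    ]).mean(dim=0)
    return probs.argmax(dim=-1)
```

In this sketch, applying `augment` to the support set during the update plays the role of data augmentation, while `predict_with_tta` applies the same kind of augmentation to the query set at prediction time, which is the test-time-augmentation side of the paper's third perspective.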