Existing fine-tuning methods either tune all parameters of the pre-trained model (full fine-tuning), which is parameter-inefficient, or tune only the last linear layer (linear probing), which suffers a significant accuracy drop compared to full fine-tuning. In this paper, we propose a new parameter-efficient fine-tuning method termed SSF, in which one only needs to Scale and Shift the deep Features extracted by a pre-trained model to match the performance of full fine-tuning. SSF even surprisingly outperforms other parameter-efficient fine-tuning approaches while using fewer tunable parameters. Furthermore, unlike some existing parameter-efficient fine-tuning methods (e.g., Adapter or VPT) that introduce extra parameters and computational cost in both the training and inference stages, SSF adds learnable parameters only during training, and these additional parameters can be merged into the original pre-trained model weights via re-parameterization in the inference phase. With the proposed SSF, our model obtains 2.46% (90.72% vs. 88.54%) and 11.48% (73.10% vs. 65.57%) Top-1 accuracy improvements on FGVC and VTAB-1k, respectively, compared to full fine-tuning, while tuning only about 0.3M parameters. We also conduct extensive experiments across various model families (CNNs, Transformers, and MLPs) and datasets. Results on 26 image classification datasets in total and 3 robustness & out-of-distribution datasets demonstrate the effectiveness of SSF. Code is available at https://github.com/dongzelian/SSF.
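To make the core idea concrete, here is a minimal PyTorch-style sketch (not the authors' implementation; the `SSFLinear` class and `merge` helper are illustrative names) of applying a learnable per-channel scale and shift to the output features of a frozen linear layer, and of folding those parameters back into the layer's weights via re-parameterization so inference costs nothing extra:

```python
import torch
import torch.nn as nn

class SSFLinear(nn.Module):
    """A frozen pre-trained linear layer followed by a learnable
    per-channel scale and shift of its output features."""
    def __init__(self, linear: nn.Linear):
        super().__init__()
        self.linear = linear
        for p in self.linear.parameters():
            p.requires_grad = False               # pre-trained weights stay frozen
        d = linear.out_features
        self.scale = nn.Parameter(torch.ones(d))  # gamma, initialized to 1
        self.shift = nn.Parameter(torch.zeros(d)) # beta, initialized to 0

    def forward(self, x):
        # y = gamma * (W x + b) + beta
        return self.linear(x) * self.scale + self.shift

    def merge(self) -> nn.Linear:
        """Re-parameterize: fold scale/shift into the frozen weights,
        since gamma*(Wx + b) + beta = (gamma*W)x + (gamma*b + beta)."""
        merged = nn.Linear(self.linear.in_features, self.linear.out_features)
        with torch.no_grad():
            merged.weight.copy_(self.linear.weight * self.scale.unsqueeze(1))
            bias = self.linear.bias if self.linear.bias is not None else 0.0
            merged.bias.copy_(bias * self.scale + self.shift)
        return merged
```

A quick equivalence check of the re-parameterization: the merged layer produces the same outputs as the scale-and-shift forward pass, e.g. `layer = SSFLinear(nn.Linear(768, 768)); x = torch.randn(4, 768); assert torch.allclose(layer(x), layer.merge()(x), atol=1e-5)`.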