Few-shot in-context learning (ICL) enables pre-trained language models to perform a previously unseen task without any gradient-based training by feeding a small number of training examples as part of the input. ICL incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made. Parameter-efficient fine-tuning (e.g. adapter modules, prompt tuning, sparse update methods, etc.) offers an alternative paradigm where a small set of parameters is trained to enable a model to perform the new task. In this paper, we rigorously compare few-shot ICL and parameter-efficient fine-tuning and demonstrate that the latter offers better accuracy as well as dramatically lower computational costs. Along the way, we introduce a new parameter-efficient fine-tuning method called (IA)$^3$ that scales activations by learned vectors, attaining stronger performance while introducing only a relatively small number of new parameters. We also propose a simple recipe, called T-Few, based on the T0 model that can be applied to new tasks without task-specific tuning or modifications. We validate the effectiveness of T-Few on completely unseen tasks by applying it to the RAFT benchmark, attaining super-human performance for the first time and outperforming the state of the art by 6% absolute. All of the code used in our experiments is publicly available.
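To make the activation-scaling idea concrete, the minimal PyTorch sketch below applies an (IA)$^3$-style learned vector to the intermediate activations of a frozen feed-forward block. The module name, layer sizes, and ReLU nonlinearity are illustrative assumptions rather than the paper's exact configuration (which also rescales attention keys and values).

```python
import torch
import torch.nn as nn

class IA3FeedForward(nn.Module):
    """Sketch of (IA)^3-style scaling in a Transformer feed-forward block:
    a learned vector l_ff rescales the intermediate activations elementwise.
    The sizes and nonlinearity here are illustrative assumptions."""

    def __init__(self, d_model: int = 512, d_ff: int = 2048):
        super().__init__()
        self.w1 = nn.Linear(d_model, d_ff, bias=False)
        self.w2 = nn.Linear(d_ff, d_model, bias=False)
        # Freeze the (pre-trained) base weights; only l_ff is updated.
        self.w1.weight.requires_grad_(False)
        self.w2.weight.requires_grad_(False)
        # (IA)^3 vector, initialized to ones so scaling starts as an identity.
        self.l_ff = nn.Parameter(torch.ones(d_ff))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = torch.relu(self.w1(x))  # intermediate activations
        h = h * self.l_ff           # elementwise rescaling by the learned vector
        return self.w2(h)
```

Because `l_ff` is initialized to ones, the model's behavior is unchanged before fine-tuning, and only the vector's $d_{ff}$ parameters per block receive gradients, which is what keeps the number of new parameters small.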