Prompt tuning approaches, which learn task-specific soft prompts for a downstream task while conditioning on frozen pre-trained models, have attracted growing interest due to their parameter efficiency. With large language models and sufficient training data, prompt tuning performs comparably to full-model fine-tuning. However, with limited training samples in few-shot settings, prompt tuning fails to match the performance of full-model fine-tuning. In this work, we focus on improving the few-shot performance of prompt tuning by transferring knowledge from soft prompts of source tasks. Recognizing the good generalization capabilities of ensemble methods in the low-data regime, we first show experimentally that a simple ensemble of model predictions based on different source prompts outperforms existing multi-prompt knowledge transfer approaches, such as source prompt fusion, in the few-shot setting. Motivated by this observation, we further investigate model ensembles and propose Sample-specific Ensemble of Source Models (SESoM). SESoM learns to adjust the contribution of each source model for each target sample separately when ensembling source model outputs. In this way, SESoM inherits the superior generalization of model ensemble approaches and simultaneously captures the sample-specific competence of each source prompt. We conduct experiments across a diverse set of eight NLP tasks using models of different scales (T5-{base, large, XL}) and find that SESoM consistently outperforms existing models of the same as well as larger parametric scale by a large margin.
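To make the contrast between a simple ensemble and a sample-specific ensemble concrete, the following is a minimal sketch and not the SESoM implementation. It assumes each of K source models has already produced a class distribution for one target sample, and it uses a hypothetical linear gate (`gate`, one weight vector per source model) over hypothetical sample features (`sample_feats`) to produce the per-sample weights. The actual weighting function in SESoM may differ.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def uniform_ensemble(source_probs):
    """Simple ensemble: average the K class distributions.

    source_probs: list of K lists, each a distribution over C classes.
    """
    k = len(source_probs)
    return [sum(col) / k for col in zip(*source_probs)]


def sample_specific_ensemble(source_probs, sample_feats, gate):
    """Weighted ensemble whose weights depend on the target sample.

    sample_feats: list of D floats describing the sample (assumption).
    gate: list of K weight vectors of length D (hypothetical linear gate).
    Returns the mixed distribution and the per-source weights.
    """
    # One relevance score per source model for this particular sample.
    scores = [sum(w * x for w, x in zip(w_k, sample_feats)) for w_k in gate]
    alphas = softmax(scores)
    num_classes = len(source_probs[0])
    mixed = [
        sum(a * p[c] for a, p in zip(alphas, source_probs))
        for c in range(num_classes)
    ]
    return mixed, alphas


# Three source models, two classes, one target sample.
source_probs = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
sample_feats = [1.0, 0.0]
gate = [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]]

baseline = uniform_ensemble(source_probs)
mixed, alphas = sample_specific_ensemble(source_probs, sample_feats, gate)
```

With an all-zero gate the weights are uniform and the two functions coincide, so the sample-specific variant can be read as a generalization of the simple ensemble. In this sketch the gate parameters are fixed by hand; in a few-shot setting they would be learned on the target samples.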