This work introduces ATTEMPT (Attentional Mixture of Prompt Tuning), a new modular, multi-task, and parameter-efficient language model (LM) tuning approach that combines knowledge transferred across different tasks via a mixture of soft prompts while keeping the original LM unchanged. ATTEMPT interpolates a set of prompts trained on large-scale source tasks and a newly initialized target-task prompt, using instance-wise attention computed by a lightweight sub-network trained on multiple target tasks. ATTEMPT is parameter-efficient (e.g., it updates 1,600 times fewer parameters than fine-tuning) and enables multi-task learning and flexible extensions; importantly, it is also more interpretable, because it reveals which source tasks affect the final model decision on target tasks. Experimental results across 17 diverse datasets show that ATTEMPT improves over prompt tuning by up to a 22% absolute performance gain and outperforms or matches fully fine-tuned or other parameter-efficient tuning approaches that use over ten times more parameters.
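The abstract describes an attentional mixture of frozen source prompts and a trainable target prompt, weighted per input instance by a lightweight sub-network. The following is a minimal illustrative sketch (not the authors' implementation), assuming pre-trained source prompts of shape (t, L, d), a bottleneck-style sub-network G, and mean pooling to form instance and prompt representations; all names and dimensions here are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionalPromptMixture(nn.Module):
    """Sketch of an instance-wise attentional mixture over soft prompts."""

    def __init__(self, source_prompts, prompt_len=100, d_model=768, d_proj=64):
        super().__init__()
        # Source prompts pre-trained on large-scale tasks, kept frozen.
        self.source_prompts = nn.Parameter(source_prompts, requires_grad=False)  # (t, L, d)
        # Newly initialized target-task prompt (trainable).
        self.target_prompt = nn.Parameter(torch.randn(prompt_len, d_model) * 0.02)
        # Lightweight sub-network G: down-projection, nonlinearity, up-projection.
        self.down = nn.Linear(d_model, d_proj)
        self.up = nn.Linear(d_proj, d_model)

    def forward(self, input_embeds):
        # input_embeds: (batch, seq_len, d_model) from the frozen LM's embedding layer.
        x = input_embeds.mean(dim=1)                        # instance representation (batch, d)
        q = self.up(F.relu(self.down(x)))                   # query produced by G (batch, d)
        # Candidate prompts: t frozen source prompts plus one trainable target prompt.
        prompts = torch.cat(
            [self.source_prompts, self.target_prompt.unsqueeze(0)], dim=0)  # (t+1, L, d)
        keys = prompts.mean(dim=1)                          # one key per prompt (t+1, d)
        attn = F.softmax(q @ keys.T, dim=-1)                # instance-wise weights (batch, t+1)
        # Interpolate the prompts per instance; the mixture would be prepended to the input.
        mixed = torch.einsum('bk,kld->bld', attn, prompts)  # (batch, L, d)
        return mixed, attn  # attn can be inspected to see which source tasks contribute
```

Only the target prompt and the sub-network G are trained under this sketch, which is consistent with the parameter-efficiency claim; returning the attention weights is what makes the source-task contributions interpretable per instance.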