Pre-trained large language models can efficiently interpolate human-written prompts in a natural way. Multitask prompted learning can help generalization through learning a diverse set of tasks at once, thus enhancing the potential for more effective downstream fine-tuning. To perform efficient multitask inference in the same batch, parameter-efficient fine-tuning methods such as prompt tuning have been proposed. However, existing prompt tuning methods may lack generalization. We propose SPT, a semi-parametric prompt tuning method for multitask prompted learning. The novel component of SPT is a memory bank from which memory prompts are retrieved based on discrete prompts. Extensive experiments, such as (i) fine-tuning a full language model with SPT on 31 different tasks from 8 different domains and evaluating zero-shot generalization on 9 heldout datasets under 5 NLP task categories, and (ii) pre-training SPT on the GLUE datasets and evaluating fine-tuning on the SuperGLUE datasets, demonstrate the effectiveness of SPT.
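The memory-bank retrieval described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the key/prompt shapes, dot-product scoring, and softmax-weighted mixing of retrieved entries are all assumptions made for the sketch.

```python
import numpy as np

def retrieve_memory_prompts(prompt_emb, memory_keys, memory_prompts, top_k=2):
    """Retrieve the top-k memory prompts whose keys score highest
    (by dot product) against the encoded discrete prompt, and return
    their softmax-weighted mixture as a soft prompt.

    prompt_emb:     (d,)                 encoding of the discrete prompt
    memory_keys:    (num_entries, d)     one key per memory-bank entry
    memory_prompts: (num_entries, L, d)  one soft prompt per entry
    """
    scores = memory_keys @ prompt_emb              # (num_entries,)
    idx = np.argsort(-scores)[:top_k]              # indices of best matches
    weights = np.exp(scores[idx] - scores[idx].max())
    weights /= weights.sum()                       # softmax over retrieved scores
    # Weighted mixture of the retrieved memory prompts -> (L, d) soft prompt
    return np.tensordot(weights, memory_prompts[idx], axes=1)

# Toy usage with random data (dimensions are illustrative only)
rng = np.random.default_rng(0)
d, num_entries, prompt_len = 8, 16, 4
memory_keys = rng.normal(size=(num_entries, d))
memory_prompts = rng.normal(size=(num_entries, prompt_len, d))
prompt_emb = rng.normal(size=d)
soft_prompt = retrieve_memory_prompts(prompt_emb, memory_keys, memory_prompts)
print(soft_prompt.shape)  # (4, 8)
```

The retrieved soft prompt would then be prepended to the input representation before the frozen language model, in the spirit of standard prompt tuning.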