The recent GPT-3 model (Brown et al., 2020) achieves remarkable few-shot performance solely by leveraging a natural-language prompt and a few task demonstrations as input context. Inspired by their findings, we study few-shot learning in a more practical scenario, where we use smaller language models for which fine-tuning is computationally efficient. We present LM-BFF (better few-shot fine-tuning of language models), a suite of simple and complementary techniques for fine-tuning language models on a small number of annotated examples. Our approach includes (1) prompt-based fine-tuning together with a novel pipeline for automating prompt generation; and (2) a refined strategy for dynamically and selectively incorporating demonstrations into each context. Finally, we present a systematic evaluation for analyzing few-shot performance on a range of NLP tasks, including classification and regression. Our experiments demonstrate that our methods combine to dramatically outperform standard fine-tuning procedures in this low-resource setting, achieving up to 30% absolute improvement, and 11% on average, across all tasks. Our approach makes minimal assumptions about task resources and domain expertise, and hence constitutes a strong task-agnostic method for few-shot learning.
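To make the two ingredients concrete, the following is a minimal sketch of how a prompt-based input with in-context demonstrations might be assembled. The template ("It was [MASK].") and the label-word mapping are illustrative assumptions, not the paper's learned or searched-for choices, and `build_context` is a hypothetical helper:

```python
# Assumed label-word mapping for sentiment classification (illustrative only).
LABEL_WORDS = {"positive": "great", "negative": "terrible"}

def make_prompt(text: str) -> str:
    """Turn a raw input into a cloze-style prompt with a [MASK] slot
    (template is an assumption, not the paper's generated one)."""
    return f"{text} It was [MASK]."

def build_context(query: str, demonstrations: list) -> str:
    """Concatenate the query prompt with labeled demonstrations.
    Each demonstration uses the same template, but with [MASK]
    replaced by the label word for its gold label."""
    parts = [make_prompt(query)]
    for demo_text, demo_label in demonstrations:
        parts.append(make_prompt(demo_text).replace("[MASK]", LABEL_WORDS[demo_label]))
    return " ".join(parts)

demos = [("A beautiful, moving film.", "positive"),
         ("A waste of two hours.", "negative")]
context = build_context("The plot was gripping.", demos)
print(context)
```

A masked language model fine-tuned on such contexts then predicts the label by comparing the probabilities of the label words ("great" vs. "terrible") at the [MASK] position; the selective-demonstration strategy in the paper additionally filters which examples are sampled into `demos` for each query.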