Recently, pre-trained language models (LMs) have achieved strong performance when fine-tuned on difficult benchmarks like SuperGLUE. However, performance can suffer when there are very few labeled examples available for fine-tuning. Pattern Exploiting Training (PET) is a recent approach that leverages patterns for few-shot learning. However, PET uses task-specific unlabeled data. In this paper, we focus on few-shot learning without any unlabeled data and introduce ADAPET, which modifies PET's objective to provide denser supervision during fine-tuning. As a result, ADAPET outperforms PET on SuperGLUE without any task-specific unlabeled data. Our code can be found at https://github.com/rrmenon10/ADAPET.