One of the most impressive results of recent NLP history is the ability of pre-trained language models to solve new tasks in a zero-shot setting. To achieve this, NLP tasks are framed as natural language prompts, generating a response indicating the predicted output. Nonetheless, the performance in such settings often lags far behind its supervised counterpart, suggesting a large space for potential improvement. In this paper, we explore methods to utilize unlabeled data to improve zero-shot performance. Specifically, we take advantage of the fact that multiple prompts can be used to specify a single task, and propose to regularize prompt consistency, encouraging consistent predictions over this diverse set of prompts. Our method makes it possible to fine-tune the model either with extra unlabeled training data, or directly on test input at inference time in an unsupervised manner. In experiments, our approach outperforms the state-of-the-art zero-shot learner, T0 (Sanh et al., 2022), on 9 out of 11 datasets across 4 NLP tasks by up to 10.6 absolute points in terms of accuracy. The gains are often attained with a small number of unlabeled examples.
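To make the core idea concrete, below is a minimal sketch of a prompt-consistency regularizer for a multiple-choice answer setting: the same input is rendered with several prompts, and the pairwise divergence between the resulting predicted distributions is penalized so that all prompts agree. The function name, tensor shapes, and pairwise symmetric-KL formulation are illustrative assumptions for this sketch; the paper's full training objective may include additional components not shown here.

```python
import torch
import torch.nn.functional as F

def prompt_consistency_loss(logits_per_prompt: torch.Tensor) -> torch.Tensor:
    """Illustrative pairwise consistency loss over predictions from multiple prompts.

    Args:
        logits_per_prompt: tensor of shape (num_prompts, num_choices) holding the
            model's answer-choice logits for one unlabeled input rendered with each prompt.

    Returns:
        Scalar loss: the mean KL divergence over ordered prompt pairs, which is
        minimized when every prompt yields the same predicted distribution.
    """
    log_probs = F.log_softmax(logits_per_prompt, dim=-1)  # (K, C) log-distributions
    probs = log_probs.exp()
    num_prompts = logits_per_prompt.size(0)

    total, pairs = logits_per_prompt.new_zeros(()), 0
    for i in range(num_prompts):
        for j in range(num_prompts):
            if i == j:
                continue
            # F.kl_div(log_q, p) computes KL(p || q): here, how far prompt j's
            # distribution deviates from prompt i's.
            total = total + F.kl_div(log_probs[j], probs[i], reduction="sum")
            pairs += 1
    return total / max(pairs, 1)
```

Averaging this loss over a batch of unlabeled examples (training data or test inputs) and backpropagating through the model is one way such a regularizer could be used for unsupervised fine-tuning, as described above.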