Using prompts to elicit downstream-task behavior from language models, also known as prompt-based learning or prompt-learning, has recently achieved notable success compared with the pre-train-then-fine-tune paradigm. Nonetheless, virtually all prompt-based methods are token-level: they rely on GPT's left-to-right language model or BERT's masked language model to perform cloze-style tasks. In this paper, we attempt to accomplish several NLP tasks in the zero-shot scenario using an original BERT pre-training task abandoned by RoBERTa and other models: Next Sentence Prediction (NSP). Unlike token-level techniques, our sentence-level prompt-based method, NSP-BERT, does not need to fix the length of the prompt or the position to be predicted, allowing it to handle tasks such as entity linking with ease. Based on the characteristics of NSP-BERT, we offer several quick prompt-building templates for various downstream tasks. In particular, we propose a two-stage prompt method for word sense disambiguation tasks. Our label-mapping strategies significantly improve the model's performance on sentence-pair tasks. On the FewCLUE benchmark, NSP-BERT outperforms other zero-shot methods on most tasks and comes close to the few-shot methods.
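To make the sentence-level prompting idea concrete, the following is a minimal sketch of how an NSP head can score candidate label sentences for zero-shot classification. It is not the authors' released implementation: the model name, the English prompt template, and the convention that NSP logit index 0 means "sentence B follows sentence A" (as in the HuggingFace BertForNextSentencePrediction head) are assumptions made for illustration.

```python
# Minimal sketch of sentence-level NSP prompting for zero-shot classification.
# Assumptions (not from the paper's code): model name, English prompt template,
# and that NSP logit index 0 corresponds to "sentence B is the continuation".
import torch
from transformers import BertTokenizer, BertForNextSentencePrediction

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForNextSentencePrediction.from_pretrained("bert-base-uncased")
model.eval()

def nsp_score(sentence_a: str, sentence_b: str) -> float:
    """Probability, under the NSP head, that sentence_b follows sentence_a."""
    inputs = tokenizer(sentence_a, sentence_b, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape (1, 2); index 0 = "is next"
    return torch.softmax(logits, dim=-1)[0, 0].item()

def zero_shot_classify(text, labels, template="This text is about {}."):
    """Pick the label whose verbalized prompt the NSP head rates as the best continuation."""
    scores = {label: nsp_score(text, template.format(label)) for label in labels}
    return max(scores, key=scores.get)

print(zero_shot_classify(
    "The team scored twice in the final minutes to win the championship.",
    ["sports", "finance", "technology"],
))
```

Because the prompt is a whole candidate sentence rather than a masked token, its length and position are unconstrained, which is what lets the same scoring scheme extend to tasks such as entity linking.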