Some NLP tasks can be solved in a fully unsupervised fashion by providing a pretrained language model with "task descriptions" in natural language (e.g., Radford et al., 2019). While this approach underperforms its supervised counterpart, we show in this work that the two ideas can be combined: We introduce Pattern-Exploiting Training (PET), a semi-supervised training procedure that reformulates input examples as cloze-style phrases to help language models understand a given task. These phrases are then used to assign soft labels to a large set of unlabeled examples. Finally, standard supervised training is performed on the resulting training set. For several tasks and languages, PET outperforms supervised training and strong semi-supervised approaches in low-resource settings by a large margin.
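To make the abstract's three-step procedure concrete, here is a minimal, illustrative sketch of the cloze-reformulation and soft-labeling idea, not the authors' implementation. It assumes the HuggingFace `transformers` library; the pattern ("It was ___.") and the verbalizer tokens ("great"/"terrible") are hypothetical choices for a sentiment task.

```python
# Sketch of the PET idea: reformulate an input as a cloze phrase,
# then read off label scores from a pretrained masked language model.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Hypothetical verbalizer: one token per label.
VERBALIZER = {"positive": "great", "negative": "terrible"}

def cloze_label_scores(text: str) -> dict:
    """Map `text` to a cloze phrase and score each verbalizer token at the mask."""
    prompt = f"{text} It was {tokenizer.mask_token}."  # hypothetical pattern
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Locate the masked position and collect the logit of each label's token.
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    return {
        label: logits[0, mask_pos, tokenizer.convert_tokens_to_ids(tok)].item()
        for label, tok in VERBALIZER.items()
    }

# Soft label for an unlabeled example: normalize the cloze scores with softmax.
scores = cloze_label_scores("A gripping, beautifully shot film.")
probs = torch.softmax(torch.tensor(list(scores.values())), dim=0)
print(dict(zip(scores.keys(), probs.tolist())))
```

In the full procedure these soft labels, aggregated over multiple patterns, would serve as training targets for a standard supervised classifier on the unlabeled set.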