We propose a new strategy for applying large pre-trained language models to novel tasks when labeled training data is limited. Rather than applying the model in a typical zero-shot or few-shot fashion, we treat it as the basis for labeling functions in a weak supervision framework. To create a classifier, we first prompt the model with multiple distinct queries about an example and define how the possible responses map to votes for labels or abstentions. We then denoise these noisy label sources using the Snorkel system and train an end classifier on the resulting training data. Our experimental evaluation shows that prompting large language models within a weak supervision framework can provide significant gains in accuracy: on the WRENCH weak supervision benchmark, this approach improves significantly over zero-shot performance, with an average 19.5% reduction in errors. We also find that it produces classifiers with accuracy comparable or superior to those trained from hand-engineered rules.
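The core idea of wrapping a prompt as a labeling function can be sketched as follows. This is a minimal illustration, not the paper's implementation: `make_prompted_lf`, `toy_model`, and the response map are hypothetical names, and a real system would call an actual language model and pass the resulting vote matrix to Snorkel's label model for denoising.

```python
ABSTAIN = -1  # weak supervision convention: a labeling function may abstain


def make_prompted_lf(prompt_template, response_map, query_model):
    """Wrap a prompt as a labeling function: ask the model a query about
    an example, then map its free-text response to a label vote or ABSTAIN."""
    def lf(example):
        response = query_model(prompt_template.format(text=example))
        return response_map.get(response.strip().lower(), ABSTAIN)
    return lf


# Hypothetical stand-in for a language-model call, for illustration only.
def toy_model(prompt):
    return "yes" if "great" in prompt.lower() else "no"


lf_positive = make_prompted_lf(
    "Does the following review express a positive sentiment? "
    "Answer yes or no.\nReview: {text}",
    {"yes": 1, "no": 0},  # map possible responses to votes for labels
    toy_model,
)

votes = [lf_positive(x) for x in ["A great film.", "Dull and slow."]]
# Each vote is one column of the noisy label matrix that Snorkel denoises.
```

In practice one would define several such prompted labeling functions per task, collect their votes over the unlabeled set, and train the end classifier on the probabilistic labels Snorkel produces.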