Although large language models can be prompted for both zero- and few-shot learning, performance drops significantly when no demonstrations are available. In this paper, we introduce Z-ICL, a new zero-shot method that closes the gap by constructing pseudo-demonstrations for a given test input using a raw text corpus. Concretely, pseudo-demonstrations are constructed by (1) finding the nearest neighbors to the test input from the corpus and pairing them with random task labels, and (2) applying a set of techniques to reduce the amount of direct copying the model does from the resulting demonstrations. Evaluation on nine classification datasets shows that Z-ICL outperforms previous zero-shot methods by a significant margin, and is on par with in-context learning with labeled training data in the few-shot setting. Overall, Z-ICL provides a significantly higher estimate of the zero-shot performance levels of a model, and supports future efforts to develop better pseudo-demonstrations that further improve zero-shot results.
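Step (1) of the method, retrieving nearest neighbors from a raw corpus and pairing them with random task labels, can be sketched as follows. This is a minimal illustration only: it uses a toy bag-of-words encoder with cosine similarity in place of the paper's actual retriever, and the function names (`embed`, `build_pseudo_demos`) are hypothetical. The copying-reduction techniques of step (2) are not reproduced here.

```python
import random
from collections import Counter
from math import sqrt

def embed(text):
    # Toy bag-of-words embedding; a real retriever would use a
    # stronger sentence encoder.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_pseudo_demos(test_input, corpus, labels, k=4, seed=0):
    """Retrieve the k corpus sentences nearest to the test input
    and pair each with a random task label (pseudo-demonstrations)."""
    rng = random.Random(seed)
    q = embed(test_input)
    ranked = sorted(corpus, key=lambda s: cosine(q, embed(s)), reverse=True)
    return [(sent, rng.choice(labels)) for sent in ranked[:k]]
```

The pseudo-demonstrations produced this way would then be prepended to the test input as an in-context prompt, exactly as labeled demonstrations are in the few-shot setting.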