In-Context Learning (ICL), which formulates target tasks as prompt completion conditioned on in-context demonstrations, has become the prevailing way of utilizing LLMs. In this paper, we first reveal a practical limitation of this typical usage: it cannot scale up with training data because of the context length restriction. Moreover, existing works have shown that ICL suffers from various biases and requires delicate calibration. To address both challenges, we advocate a simple and effective solution, $k$NN Prompting, which first queries the LLM with training data to obtain distributed representations, and then predicts test instances by simply referring to their nearest neighbors. We conduct comprehensive experiments to demonstrate its two-fold superiority: 1) Calibration-Free: $k$NN Prompting does not directly align the LLM output distribution with the task-specific label space, but instead leverages that distribution to align test instances with training instances. It significantly outperforms state-of-the-art calibration-based methods under comparable few-shot scenarios. 2) Beyond-Context: $k$NN Prompting scales effectively with as much training data as is available, continually bringing substantial improvements. The scaling trend holds across shot counts ranging from 2 to 1024 (ten successive doublings) as well as LLM scales ranging from 0.8B to 30B parameters. It successfully bridges data scaling into model scaling and brings new potential to the gradient-free paradigm of LLM deployment. Code is publicly available.
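To make the two-step procedure described above concrete, here is a minimal sketch in Python. It assumes a hypothetical hook `lm_next_token_distribution` that returns an LLM's next-token probability distribution for a given prompt; the prompt template, the KL-divergence distance, and the choice of $k$ are illustrative assumptions rather than the paper's exact configuration.

```python
import numpy as np

def lm_next_token_distribution(prompt: str) -> np.ndarray:
    """Hypothetical LLM call: probability distribution over the vocabulary
    for the next token given `prompt`."""
    raise NotImplementedError

def kl_divergence(p: np.ndarray, q: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Row-wise KL(p || q_i) between one query distribution p and anchors q."""
    p, q = p + eps, q + eps
    return np.sum(p * (np.log(p) - np.log(q)), axis=-1)

def knn_prompting_predict(demos, train_texts, train_labels, test_text, k=3):
    # Step 1: query the LLM once per training instance to collect "anchor"
    # output distributions; this is how training data beyond the context
    # limit is absorbed, since each query only carries a few demonstrations.
    anchors = np.stack([
        lm_next_token_distribution(demos + f"Input: {x}\nLabel:")  # illustrative template
        for x in train_texts
    ])
    # Step 2: represent the test instance with the same kind of distribution.
    query = lm_next_token_distribution(demos + f"Input: {test_text}\nLabel:")
    # Step 3: predict by majority vote over the k nearest training anchors,
    # so no calibration of the raw output distribution is required.
    nearest = np.argsort(kl_divergence(query, anchors))[:k]
    votes = [train_labels[i] for i in nearest]
    return max(set(votes), key=votes.count)
```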