Large language models have exhibited intriguing in-context learning capabilities, achieving promising zero- and few-shot performance without updating any parameters. However, conventional in-context learning is usually restricted by length constraints, making it ineffective at absorbing supervision from a large number of examples. To go beyond a few shots, we introduce structured prompting, which breaks the length limit and scales in-context learning to thousands of examples. Specifically, demonstration examples are separately encoded with well-designed position embeddings and then jointly attended to by the test example through a rescaled attention mechanism. As a result, the number of exemplars scales with linear rather than quadratic complexity with respect to length. Experimental results on a diverse set of tasks show that, as the number of demonstration examples increases, our approach improves end-task performance and reduces evaluation variance compared with conventional in-context learning. Code has been released at https://aka.ms/structured-prompting.
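The combination step above can be illustrated with a minimal numpy sketch. This is a simplified, hypothetical variant for illustration only, not the paper's exact formulation: each of the `M` demonstration groups is assumed to be encoded independently (sharing the same right-aligned positions), the query's attention logits against each group are computed in isolation, and the test segment's exponentiated logits are upweighted by `M` so the parallel groups do not drown it out. All function and variable names here are invented for the example.

```python
import numpy as np

def attn_logits(q, K):
    # Raw scaled dot-product attention logits of one query against a key matrix.
    return K @ q / np.sqrt(q.shape[0])

def rescaled_attention(q, demo_KVs, test_K, test_V):
    """Combine M independently encoded demonstration groups with the test
    segment.  Hypothetical simplified rescaling: the test segment's
    exp-logits are multiplied by M before joint normalization."""
    M = len(demo_KVs)
    exp_parts = [np.exp(attn_logits(q, K)) for K, _ in demo_KVs]
    exp_test = M * np.exp(attn_logits(q, test_K))
    Z = sum(e.sum() for e in exp_parts) + exp_test.sum()  # joint normalizer
    out = sum((e / Z) @ V for e, (_, V) in zip(exp_parts, demo_KVs))
    return out + (exp_test / Z) @ test_V
```

Because each demonstration group is encoded on its own, encoding M groups of length L costs on the order of M·L² rather than (M·L)², which is where the linear (in the number of groups) scaling comes from. With a single group (M = 1), this sketch reduces to ordinary attention over the concatenated sequence.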