Instruction learning of Large Language Models (LLMs) has enabled zero-shot task generalization. However, instruction learning has been predominantly approached as a fine-tuning problem, including instruction tuning and reinforcement learning from human feedback, where LLMs are multi-task fine-tuned on various tasks with instructions. In this paper, we present a surprising finding that applying in-context learning to instruction learning, referred to as In-Context Instruction Learning (ICIL), significantly improves the zero-shot task generalization performance for both pretrained and instruction-fine-tuned models. One of the core advantages of ICIL is that it uses a single fixed prompt, a concatenation of cross-task demonstrations, to evaluate all tasks. In particular, we demonstrate that the most powerful instruction-fine-tuned baseline (text-davinci-003) also benefits from ICIL by 9.3%, indicating that the effect of ICIL is complementary to instruction-based fine-tuning.
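As an illustration, below is a minimal sketch of how such a fixed cross-task demonstration prompt might be assembled; the demonstration tasks, texts, and instruction/input/output format are hypothetical placeholders rather than the paper's actual demonstration set.

```python
# A minimal sketch of In-Context Instruction Learning (ICIL) prompt construction,
# assuming a simple Instruction/Input/Output demonstration format. The demonstration
# tasks and texts below are hypothetical, not the paper's actual demonstrations.

# Fixed set of cross-task demonstrations, each drawn from a task other than the target.
DEMONSTRATIONS = [
    {
        "instruction": "Classify the sentiment of the review as positive or negative.",
        "input": "The battery dies within an hour of unplugging it.",
        "output": "negative",
    },
    {
        "instruction": "Answer the question based on the given passage.",
        "input": "Passage: The Nile flows north into the Mediterranean Sea.\n"
                 "Question: Into which sea does the Nile flow?",
        "output": "the Mediterranean Sea",
    },
]


def build_icil_prompt(target_instruction: str, target_input: str) -> str:
    """Concatenate the fixed cross-task demonstrations, then append the target task."""
    blocks = [
        f"Instruction: {d['instruction']}\nInput: {d['input']}\nOutput: {d['output']}"
        for d in DEMONSTRATIONS
    ]
    # The same demonstration prefix is reused, unchanged, for every evaluation task.
    blocks.append(f"Instruction: {target_instruction}\nInput: {target_input}\nOutput:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    prompt = build_icil_prompt(
        "Translate the sentence into French.",
        "The weather is nice today.",
    )
    print(prompt)  # This prompt would then be sent to the LLM for zero-shot inference.
```

Because the demonstration prefix is fixed, zero-shot evaluation on a new task only requires appending that task's instruction and input, with no per-task prompt engineering or fine-tuning.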