Pretrained large language models (LLMs) are strong in-context learners, able to perform few-shot learning without changing model parameters. However, as we show, fine-tuning an LLM on any specific task generally destroys its in-context ability. We identify an important cause of this loss: format specialization, in which the model overfits to the format of the fine-tuned task and is unable to output anything beyond this format. We further show that format specialization arises at the very beginning of fine-tuning. To address this problem, we propose Prompt Tuning with MOdel Tuning (ProMoT), a simple yet effective two-stage fine-tuning framework that preserves the in-context abilities of the pretrained model. ProMoT first trains a soft prompt for the fine-tuning target task, and then fine-tunes the model itself with this soft prompt attached. ProMoT offloads task-specific formats into the soft prompt, which can be removed when performing other in-context tasks. We fine-tune mT5 XXL with ProMoT on natural language inference (NLI) and English-French translation and evaluate the in-context abilities of the resulting models on 8 different NLP tasks. ProMoT achieves performance on the fine-tuned tasks similar to vanilla fine-tuning, but with far less degradation of in-context learning performance across the board. More importantly, ProMoT shows remarkable generalization ability on tasks with different formats: e.g., fine-tuning on an NLI binary classification task improves the model's in-context ability to do summarization (+0.53 Rouge-2 score compared to the pretrained model), making ProMoT a promising method for building general-purpose capabilities such as grounding and reasoning into LLMs with small but high-quality datasets. When extended to sequential or multi-task training, ProMoT achieves even better out-of-domain generalization performance.
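The following is a minimal sketch of the two-stage ProMoT recipe described above, not the authors' implementation. It assumes a HuggingFace-style seq2seq model that accepts `inputs_embeds`; the checkpoint name (a small mT5 stand-in for mT5 XXL), the prompt length, the learning rates, and the placeholder `task_loader` are all illustrative assumptions, and whether the soft prompt stays trainable in stage 2 is a detail the abstract does not specify.

```python
import torch
from torch import nn
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "google/mt5-small"          # small stand-in for mT5 XXL (assumption)
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

n_prompt = 20                            # number of soft-prompt tokens (assumed)
d_model = model.get_input_embeddings().embedding_dim
soft_prompt = nn.Parameter(torch.randn(n_prompt, d_model) * 0.02)

def forward_with_prompt(input_ids, labels):
    # Prepend the soft prompt to the input token embeddings.
    emb = model.get_input_embeddings()(input_ids)              # (B, T, D)
    prompt = soft_prompt.unsqueeze(0).expand(emb.size(0), -1, -1)
    inputs_embeds = torch.cat([prompt, emb], dim=1)
    attn = torch.ones(inputs_embeds.shape[:2], dtype=torch.long)
    return model(inputs_embeds=inputs_embeds, attention_mask=attn, labels=labels)

# Stage 1: prompt tuning -- the model is frozen, only the soft prompt is trained.
for p in model.parameters():
    p.requires_grad_(False)
opt1 = torch.optim.AdamW([soft_prompt], lr=1e-3)
# for batch in task_loader:               # task_loader is a placeholder
#     loss = forward_with_prompt(batch["input_ids"], batch["labels"]).loss
#     loss.backward(); opt1.step(); opt1.zero_grad()

# Stage 2: model tuning with the learned soft prompt attached.
for p in model.parameters():
    p.requires_grad_(True)
opt2 = torch.optim.AdamW(model.parameters(), lr=1e-5)
# for batch in task_loader:
#     loss = forward_with_prompt(batch["input_ids"], batch["labels"]).loss
#     loss.backward(); opt2.step(); opt2.zero_grad()

# At inference on other in-context tasks, drop the soft prompt and use the
# fine-tuned model directly, so the task-specific format stays in the prompt.
```

In this sketch the task-specific format lives entirely in `soft_prompt`; removing it at evaluation time is what lets the fine-tuned model keep its general in-context behavior.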