We present LLaMA-Adapter, a lightweight adaption method to efficiently fine-tune LLaMA into an instruction-following model. Using 52K self-instruct demonstrations, LLaMA-Adapter introduces only 1.2M learnable parameters on top of the frozen LLaMA 7B model, and requires less than one hour of fine-tuning on 8 A100 GPUs. Specifically, we adopt a set of learnable adaption prompts and prepend them to the input text tokens at the higher transformer layers. We then propose a zero-init attention mechanism with zero gating, which adaptively injects the new instructional cues into LLaMA while effectively preserving its pre-trained knowledge. With efficient training, LLaMA-Adapter generates high-quality responses comparable to Alpaca with its fully fine-tuned 7B parameters. Furthermore, our approach can be simply extended to multi-modal input, e.g., images, for image-conditioned LLaMA, which achieves superior reasoning capacity on ScienceQA. We release our code at https://github.com/ZrrSkywalker/LLaMA-Adapter.
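To make the mechanism concrete, the following is a minimal, hypothetical PyTorch sketch of the zero-init attention idea described above: learnable adaption prompts supply extra keys and values, and their contribution is scaled by a gating scalar initialized to zero, so fine-tuning starts from the frozen model's original behavior. The class and parameter names (`ZeroInitPromptAttention`, `n_prompts`) are assumptions, the attention is simplified to a single head with separate softmaxes over tokens and prompts, and this is not the released implementation (see the repository linked above for that).

```python
# Minimal sketch of zero-init gated adaption-prompt attention.
# Hypothetical simplification, not the authors' released code.
import torch
import torch.nn as nn


class ZeroInitPromptAttention(nn.Module):
    """Single-head self-attention whose keys/values are extended with
    learnable adaption prompts; the prompt branch is scaled by a
    zero-initialized gate, so it contributes nothing at step 0."""

    def __init__(self, dim: int, n_prompts: int = 10):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim, bias=False)
        self.k_proj = nn.Linear(dim, dim, bias=False)
        self.v_proj = nn.Linear(dim, dim, bias=False)
        self.o_proj = nn.Linear(dim, dim, bias=False)
        # Learnable adaption prompts (prepended in key/value space).
        self.prompts = nn.Parameter(torch.randn(n_prompts, dim) * 0.02)
        # Zero-initialized gating scalar.
        self.gate = nn.Parameter(torch.zeros(1))
        self.scale = dim ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim) hidden states from a frozen layer.
        b, _, _ = x.shape
        q = self.q_proj(x)
        k_x, v_x = self.k_proj(x), self.v_proj(x)
        p = self.prompts.unsqueeze(0).expand(b, -1, -1)
        k_p, v_p = self.k_proj(p), self.v_proj(p)

        # Attention over the original tokens (causal mask omitted for
        # brevity) and over the adaption prompts, scored separately.
        attn_x = torch.softmax(q @ k_x.transpose(1, 2) * self.scale, dim=-1)
        attn_p = torch.softmax(q @ k_p.transpose(1, 2) * self.scale, dim=-1)

        # Gate only the prompt branch; tanh keeps the gate bounded.
        out = attn_x @ v_x + torch.tanh(self.gate) * (attn_p @ v_p)
        return self.o_proj(out)


if __name__ == "__main__":
    layer = ZeroInitPromptAttention(dim=64, n_prompts=10)
    h = torch.randn(2, 16, 64)
    print(layer(h).shape)  # torch.Size([2, 16, 64])
```

Because the gate starts at zero, the module initially reduces to plain attention over the input tokens, and the instructional signal from the prompts is injected only as the gate is learned during fine-tuning.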