The success of large language models (LLMs), such as GPT-3 and ChatGPT, has led to the development of numerous cost-effective and accessible alternatives created by fine-tuning open-access LLMs with task-specific data (e.g., ChatDoctor) or instruction data (e.g., Alpaca). Among the various fine-tuning methods, adapter-based parameter-efficient fine-tuning (PEFT) is undoubtedly one of the most attractive approaches, as it requires fine-tuning only a small number of external parameters rather than the entire LLM while achieving comparable or even better performance. To enable further research on PEFT methods for LLMs, this paper presents LLM-Adapters, an easy-to-use framework that integrates various adapters into LLMs and can run these adapter-based PEFT methods on different tasks. The framework includes state-of-the-art open-access LLMs such as LLaMA, BLOOM, OPT, and GPT-J, as well as widely used adapters such as Series adapters, Parallel adapters, and LoRA. The framework is designed to be research-friendly, efficient, modular, and extendable, allowing new adapters to be integrated and evaluated with new and larger-scale LLMs. Furthermore, to evaluate the effectiveness of adapters in LLM-Adapters, we conduct experiments on six math reasoning datasets. The results demonstrate that using adapter-based PEFT in smaller-scale LLMs (7B) with few additional trainable parameters yields performance comparable, and in some cases superior, to that of powerful LLMs (175B) in zero-shot inference on simple math reasoning datasets. Overall, we provide a promising framework for fine-tuning large LLMs on downstream tasks. We believe the proposed LLM-Adapters will advance adapter-based PEFT research, facilitate the deployment of research pipelines, and enable practical applications in real-world systems.
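To make the adapter families named above concrete, the following is a minimal PyTorch sketch of a LoRA-wrapped linear layer and a Series (bottleneck) adapter. The class names, hyperparameters, and initialization choices here are illustrative assumptions for exposition and are not the LLM-Adapters API; only the adapter parameters are trainable, while the base weights stay frozen.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer augmented with a low-rank update: W x + (alpha/r) * B A x."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # base LLM weights are frozen; only LoRA matrices train
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init -> no change at start
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling


class SeriesAdapter(nn.Module):
    """Bottleneck adapter inserted sequentially after a sublayer: down-project, nonlinearity, up-project, residual."""
    def __init__(self, hidden_size: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.ReLU()

    def forward(self, h):
        return h + self.up(self.act(self.down(h)))
```

A Parallel adapter differs only in placement: the same bottleneck branch is applied to the sublayer's input and added to its output, rather than being stacked after it.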