The success of large language models (LLMs), such as GPT-3 and ChatGPT, has led to the development of numerous cost-effective and accessible alternatives created by fine-tuning open-access LLMs with task-specific data (e.g., ChatDoctor) or instruction data (e.g., Alpaca). Among the various fine-tuning methods, adapter-based parameter-efficient fine-tuning (PEFT) is undoubtedly one of the most attractive approaches, as it requires fine-tuning only a small number of external parameters rather than the entire LLM while achieving comparable or even better performance. To enable further research on PEFT methods for LLMs, this paper presents LLM-Adapters, an easy-to-use framework that integrates various adapters into LLMs and can run these adapter-based PEFT methods on different tasks. The framework includes state-of-the-art open-access LLMs such as LLaMA, BLOOM, OPT, and GPT-J, as well as widely used adapters such as Series adapters, Parallel adapters, and LoRA. The framework is designed to be research-friendly, efficient, modular, and extendable, allowing new adapters to be integrated and evaluated with new and larger-scale LLMs. Furthermore, to evaluate the effectiveness of adapters in LLM-Adapters, we conduct experiments on six math reasoning datasets. The results demonstrate that using adapter-based PEFT in smaller-scale LLMs (7B) with few additional trainable parameters yields performance comparable, and in some cases superior, to that of powerful LLMs (175B) in zero-shot inference on simple math reasoning datasets. Overall, we provide a promising framework for fine-tuning large LLMs on downstream tasks. We believe the proposed LLM-Adapters will advance adapter-based PEFT research, facilitate the deployment of research pipelines, and enable practical applications in real-world systems.
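To make the adapter families named above concrete, the following is a minimal PyTorch sketch of a LoRA-wrapped linear layer and a Series (bottleneck) adapter. The class names, hyperparameters, and initialization choices here are illustrative assumptions for exposition and are not the LLM-Adapters API; only the adapter parameters are trainable, while the base weights stay frozen.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer augmented with a low-rank update: W x + (alpha/r) * B A x."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # base LLM weights are frozen; only LoRA matrices train
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init -> no change at start
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling


class SeriesAdapter(nn.Module):
    """Bottleneck adapter inserted sequentially after a sublayer: down-project, nonlinearity, up-project, residual."""
    def __init__(self, hidden_size: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.ReLU()

    def forward(self, h):
        return h + self.up(self.act(self.down(h)))
```

A Parallel adapter differs only in placement: the same bottleneck branch is applied to the sublayer's input and added to its output, rather than being stacked after it.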