Parameter-efficient fine-tuning aims to achieve performance comparable to full fine-tuning while using far fewer trainable parameters. Several strategies (e.g., Adapters, prefix tuning, BitFit, and LoRA) have been proposed. However, their designs are hand-crafted separately, and it remains unclear whether certain design patterns exist for parameter-efficient fine-tuning. Thus, we present a parameter-efficient fine-tuning design paradigm and discover design patterns that are applicable to different experimental settings. Instead of focusing on designing yet another individual tuning strategy, we introduce parameter-efficient fine-tuning design spaces that parameterize tuning structures and tuning strategies. Specifically, any design space is characterized by four components: layer grouping, trainable parameter allocation, tunable groups, and strategy assignment. Starting from an initial design space, we progressively refine the space based on the model quality of each design choice, making a greedy selection over these four components at each stage. We discover the following design patterns: (i) group layers in a spindle pattern; (ii) allocate trainable parameters to layers uniformly; (iii) tune all the groups; (iv) assign appropriate tuning strategies to different groups. These design patterns yield new parameter-efficient fine-tuning methods. We show experimentally that these methods consistently and significantly outperform the investigated parameter-efficient fine-tuning strategies across different backbone models and different natural language processing tasks.
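To make the four components concrete, the following is a minimal Python sketch of one point in such a design space, assuming a hypothetical 24-layer backbone. The `DesignSpace` container, the `spindle_grouping` heuristic, the parameter budget, and the per-group strategy names are illustrative assumptions for exposition, not the paper's exact configuration; the sketch simply encodes the four discovered patterns (spindle grouping, uniform parameter allocation, tuning every group, per-group strategy assignment).

```python
from dataclasses import dataclass
from typing import List

@dataclass
class DesignSpace:
    """One point in a parameter-efficient fine-tuning design space,
    characterized by the four components named in the abstract."""
    group_sizes: List[int]        # layer grouping: a partition of the backbone layers
    params_per_layer: List[int]   # trainable-parameter allocation per layer
    tunable: List[bool]           # which groups receive trainable parameters
    strategies: List[str]         # tuning strategy assigned to each group

def spindle_grouping(num_layers: int, num_groups: int = 4) -> List[int]:
    """Partition layers so group sizes rise toward the middle and shrink
    at the ends (a simple spindle-shaped heuristic, assumed here)."""
    mid = (num_groups - 1) / 2.0
    weights = [mid + 1.0 - abs(i - mid) for i in range(num_groups)]
    total = sum(weights)
    sizes = [max(1, round(num_layers * w / total)) for w in weights]
    sizes[-1] += num_layers - sum(sizes)  # absorb rounding drift
    return sizes

if __name__ == "__main__":
    n_layers, budget = 24, 24 * 64           # hypothetical depth and parameter budget
    sizes = spindle_grouping(n_layers)        # (i) spindle-pattern grouping, e.g. [4, 8, 8, 4]
    space = DesignSpace(
        group_sizes=sizes,
        params_per_layer=[budget // n_layers] * n_layers,    # (ii) uniform allocation
        tunable=[True] * len(sizes),                         # (iii) tune all groups
        strategies=["adapter", "prefix", "bitfit", "lora"],  # (iv) per-group strategies (illustrative)
    )
    print(space)
```

In the procedure described above, each of these four components would be fixed greedily, stage by stage, according to the model quality obtained for each candidate design choice.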