Parameter-Efficient Fine-Tuning with Layer Pruning on Free-Text Sequence-to-Sequence Modeling

May 19, 2023

Yunqi Zhu, Xuebing Yang, Yuanyuan Wu, Wensheng Zhang

The increasing size of language models has sparked great research interest in parameter-efficient fine-tuning methods such as LoRA, which freezes the pre-trained model and injects small-scale trainable parameters for multiple downstream tasks (e.g., summarization, question answering, and translation). To further enhance the efficiency of fine-tuning, we propose a framework that integrates LoRA and structured layer pruning. The integrated framework is validated on two de-identified medical report summarization datasets that we created from MIMIC-IV-Note and on two public medical dialogue datasets. By tuning 0.6% of the original model's parameters and pruning over 30% of the Transformer layers, our framework can reduce GPU memory usage by 50% and speed up the training phase by 100%, while preserving over 92% of the generation quality on free-text sequence-to-sequence tasks.
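
The bookkeeping behind combining LoRA with structured layer pruning can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: `LayerSpec`, `prune_layers`, and `trainable_fraction` are hypothetical names, the model size (12 layers, width 768), the LoRA rank (8), the two adapted projections per layer, and the choice of which layers to drop are all made up for the example, and the per-layer parameter count is only a rough approximation. The resulting numbers are not the paper's.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerSpec:
    """Hypothetical summary of one Transformer layer (not from the paper)."""
    index: int
    d_model: int

    @property
    def frozen_params(self) -> int:
        # Rough count: 4 attention projections (4 * d^2) plus a feed-forward
        # block with a 4x expansion (8 * d^2); biases and norms are ignored.
        return 12 * self.d_model ** 2

    def lora_params(self, rank: int, n_adapted: int = 2) -> int:
        # Each adapted d x d weight W gets a low-rank update B @ A,
        # with A of shape (rank, d) and B of shape (d, rank).
        return n_adapted * 2 * rank * self.d_model


def prune_layers(layers, drop):
    """Structured pruning: remove whole layers whose index is in `drop`."""
    drop = set(drop)
    return [layer for layer in layers if layer.index not in drop]


def trainable_fraction(layers, rank):
    """Share of LoRA (trainable) parameters among all layer parameters."""
    trainable = sum(layer.lora_params(rank) for layer in layers)
    frozen = sum(layer.frozen_params for layer in layers)
    return trainable / (frozen + trainable)


# Assumed toy configuration: 12 layers, d_model = 768, LoRA rank 8.
layers = [LayerSpec(i, 768) for i in range(12)]
# Assumed pruning choice: drop every third layer (4 of 12, about 33%).
pruned = prune_layers(layers, drop=range(2, 12, 3))

print(len(pruned))  # 8 layers remain
print(sum(layer.lora_params(8) for layer in pruned))  # 196608 trainable parameters
print(f"{trainable_fraction(pruned, 8):.4%}")
```

In this toy setup, pruning shrinks the absolute number of frozen and trainable parameters (and with it the memory and compute per training step), while the trainable share per remaining layer stays the same because every layer carries the same LoRA overhead.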
