In this paper, we propose Ahead-of-Time (AoT) P-Tuning, a novel parameter-efficient fine-tuning method for pre-trained Language Models (LMs) that adds an input-dependent bias before each Transformer layer. We evaluate AoT P-Tuning on the GLUE and SuperGLUE benchmark datasets using RoBERTa and DeBERTa models, showing that it outperforms BitFit and is comparable to or better than other baseline methods for efficient fine-tuning. Additionally, we assess the inference overhead of AoT P-Tuning and demonstrate that it introduces negligible overhead compared to established baseline methods. Our method enables multi-task inference with a single backbone LM, making it a practical solution for real-world applications.
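To make the core mechanism concrete, the following is a minimal PyTorch sketch, not the authors' implementation, of an input-dependent bias applied to hidden states before a Transformer layer. The class name `AoTBiasLayer` and the parameterization as a per-token lookup table trained ahead of time are illustrative assumptions.

```python
import torch
import torch.nn as nn


class AoTBiasLayer(nn.Module):
    """Illustrative sketch: a per-layer, input-dependent bias looked up by token id.

    Assumes the bias for each vocabulary entry is stored in a trainable table
    (prepared "ahead of time"), so applying it at inference costs only a
    lookup and an addition.
    """

    def __init__(self, vocab_size: int, hidden_size: int):
        super().__init__()
        # One trainable bias vector per vocabulary token for this layer.
        self.bias_table = nn.Embedding(vocab_size, hidden_size)
        nn.init.zeros_(self.bias_table.weight)

    def forward(self, hidden_states: torch.Tensor, input_ids: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_size); input_ids: (batch, seq_len)
        return hidden_states + self.bias_table(input_ids)


if __name__ == "__main__":
    # Usage sketch: such a bias would be inserted before each frozen Transformer layer.
    batch, seq_len, vocab_size, hidden_size = 2, 8, 50_265, 768
    layer_bias = AoTBiasLayer(vocab_size, hidden_size)
    hidden = torch.randn(batch, seq_len, hidden_size)
    ids = torch.randint(0, vocab_size, (batch, seq_len))
    out = layer_bias(hidden, ids)   # same shape as hidden
    print(out.shape)                # torch.Size([2, 8, 768])
```

Because the bias table is the only trainable component per task under these assumptions, swapping tables enables multi-task inference over a single frozen backbone.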