Prompt tuning approaches, which learn task-specific soft prompts for downstream tasks while keeping the pre-trained model frozen, have attracted growing interest due to their parameter efficiency. With large language models and sufficient training data, prompt tuning performs comparably to full-model tuning. However, with limited training samples in few-shot settings, prompt tuning fails to match the performance of full-model fine-tuning. In this work, we focus on improving the few-shot performance of prompt tuning by transferring knowledge from soft prompts of source tasks. Recognizing the good generalization capabilities of ensemble methods in low-data regimes, we first show experimentally that a simple ensemble of model predictions based on different source prompts outperforms existing multi-prompt knowledge transfer approaches, such as source prompt fusion, in the few-shot setting. Motivated by this observation, we further investigate model ensembles and propose Sample-specific Ensemble of Source Models (SESoM). SESoM learns to adjust the contribution of each source model separately for each target sample when ensembling source model outputs. In this way, SESoM inherits the superior generalization of model ensemble approaches while simultaneously capturing the sample-specific competence of each source prompt. We conduct experiments across a diverse set of eight NLP tasks using models of different scales (T5-{base, large, XL}) and find that SESoM consistently outperforms existing methods of the same or larger parameter scale by a large margin.
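To make the core idea concrete, below is a minimal sketch of a sample-specific ensemble. It is not the paper's exact architecture: the gating network design, the `sample_repr` encoding, and all tensor shapes are illustrative assumptions. The sketch only shows the mechanism the abstract describes, namely learning per-sample weights over frozen source-model outputs and mixing their predictions.

```python
# Minimal sketch (assumptions, not the paper's implementation) of a
# sample-specific ensemble: a small gating network scores each source model
# per input, and the source models' output distributions are mixed with
# those per-sample weights.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SampleSpecificEnsemble(nn.Module):
    def __init__(self, hidden_dim: int, num_sources: int):
        super().__init__()
        # Hypothetical gating network: maps a sample representation
        # to one weight per source model.
        self.gate = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, num_sources),
        )

    def forward(self, sample_repr: torch.Tensor, source_logits: torch.Tensor):
        # sample_repr:   (batch, hidden_dim) encoding of the target sample
        # source_logits: (batch, num_sources, num_classes) outputs of the
        #                frozen models, each conditioned on a different
        #                source prompt
        weights = F.softmax(self.gate(sample_repr), dim=-1)   # (batch, num_sources)
        probs = F.softmax(source_logits, dim=-1)              # per-source distributions
        mixed = torch.einsum("bs,bsc->bc", weights, probs)    # per-sample weighted mix
        return mixed, weights
```

Under these assumptions, only the small gating network is trained on the few-shot target data, which is consistent with the abstract's emphasis on parameter efficiency and on adjusting each source model's contribution per sample.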