Global Prompt Cell: A Portable Control Module for Effective Prompt - Zhuanzhi Papers


Prompt · Fine-tuning · Portability · Cell · Embedding

April 12, 2023

Global Prompt Cell: A Portable Control Module for Effective Prompt


Chi Liu, Haochun Wang, Nuwa Xi, Sendong Zhao, Bing Qin

As a novel approach to tuning pre-trained models, prompt tuning involves freezing the parameters in downstream tasks while inserting trainable embeddings into inputs in the first layer. However, previous methods have mainly focused on the initialization of prompt embeddings. The question of how to train and utilize prompt embeddings in a reasonable way has become a limiting factor in the effectiveness of prompt tuning. To address this issue, we introduce the Global Prompt Cell (GPC), a portable control module for prompt tuning that selectively preserves prompt information across all encoder layers. Our experimental results demonstrate a 5.8% improvement on SuperGLUE datasets compared to vanilla prompt tuning.
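The abstract names two mechanisms: vanilla prompt tuning, which prepends trainable embeddings to the input of the first layer while the pre-trained weights stay frozen, and a control module that selectively preserves prompt information across encoder layers. The following is a minimal pure-Python sketch of that general idea, not the authors' implementation. The abstract does not describe GPC's internals, so the sigmoid gate here is a hypothetical stand-in, and every name (`prepend_prompt`, `toy_frozen_layer`, `gated_carry`, `encode`) as well as the toy layer itself is an assumption made for illustration.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def prepend_prompt(prompt, token_embeddings):
    """Vanilla prompt tuning: put the trainable prompt vectors in front of
    the token embeddings before the first encoder layer."""
    return [row[:] for row in prompt] + [row[:] for row in token_embeddings]


def toy_frozen_layer(hidden, weight):
    """Stand-in for one frozen encoder layer (assumption: not a transformer).
    Each position sees the mean of all positions, so prompt and tokens interact."""
    dim = len(weight)
    mean = [sum(row[j] for row in hidden) / len(hidden) for j in range(dim)]
    return [
        [math.tanh(w * (h + m)) for w, h, m in zip(weight, row, mean)]
        for row in hidden
    ]


def gated_carry(prev_prompt, layer_prompt, gate_logits):
    """Hypothetical gate: per dimension, keep a fraction g of the prompt
    carried from earlier layers and take the rest from this layer's output."""
    gates = [sigmoid(g) for g in gate_logits]
    return [
        [g * a + (1.0 - g) * b for g, a, b in zip(gates, prev_row, new_row)]
        for prev_row, new_row in zip(prev_prompt, layer_prompt)
    ]


def encode(prompt, token_embeddings, layer_weights, gate_logits):
    """Run all layers, re-injecting the gated prompt after each one.
    In training, only `prompt` and `gate_logits` would receive updates;
    `layer_weights` and `token_embeddings` stay frozen."""
    n_prompt = len(prompt)
    hidden = prepend_prompt(prompt, token_embeddings)
    carried = [row[:] for row in prompt]
    for weight in layer_weights:
        hidden = toy_frozen_layer(hidden, weight)
        carried = gated_carry(carried, hidden[:n_prompt], gate_logits)
        hidden = carried + hidden[n_prompt:]
    return hidden


if __name__ == "__main__":
    rng = random.Random(0)
    dim, n_prompt, n_tokens, n_layers = 4, 2, 3, 3

    def rand_rows(n):
        return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n)]

    out = encode(rand_rows(n_prompt), rand_rows(n_tokens),
                 rand_rows(n_layers), [0.0] * dim)
    print(len(out), "positions of width", len(out[0]))
```

With large positive gate logits the original prompt passes through every layer almost unchanged; with large negative logits the sketch reduces to using whatever each layer outputs at the prompt positions. A real setup would replace the toy layer with frozen transformer blocks and learn the prompt and gate by gradient descent.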

