Vector-Quantized Input-Contextualized Soft Prompts for Natural Language Understanding - Zhuanzhi Papers

May 23, 2022

Vector-Quantized Input-Contextualized Soft Prompts for Natural Language Understanding

Rishabh Bhardwaj, Amrita Saha, Steven C. H. Hoi

Prompt Tuning (PT) has been largely successful as a parameter-efficient way of conditioning large-scale pre-trained language models towards a downstream task. More recently, soft prompt tuning has aimed to learn a fixed set of task-specific continuous vectors, i.e., soft tokens that remain static across the task samples. However, a fixed prompt may not generalize well to the diverse kinds of inputs the task comprises. With this motivation, we propose a novel way of prompting, Vector-quantized Input-contextualized Prompt Tuning, or VIP. Essentially, VIP focuses on two aspects: (i) input adaptation, the input-specific contextualization of the soft tokens; and (ii) vector quantization, in which the tokens are passed through a quantizer that reduces representation variance by sampling prompts from a compact latent space. Over a wide range of natural language understanding tasks (SuperGLUE, QA, Relation Classification, NER, NLI), our proposed VIP framework beats the PT model by a margin of 1.19%. Additionally, on out-of-domain QA and multi-task setups over 4 different tasks spanning 12 domains, we find that VIP outperforms PT by 0.75%.
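The two aspects named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the `contextualize` function, its `mix` parameter, and the mean-pooling blend are hypothetical stand-ins for a learned contextualizer, and `quantize` uses a plain nearest-neighbour codebook lookup (as in standard vector quantization) rather than the sampling procedure the abstract mentions.

```python
def contextualize(soft_tokens, input_embeddings, mix=0.5):
    """Blend each static soft token with the mean input embedding.

    Hypothetical stand-in for a learned contextualizer; `mix` is an
    illustrative parameter, not something taken from the paper.
    """
    dim = len(soft_tokens[0])
    n = len(input_embeddings)
    mean_input = [sum(e[d] for e in input_embeddings) / n for d in range(dim)]
    return [
        [(1 - mix) * tok[d] + mix * mean_input[d] for d in range(dim)]
        for tok in soft_tokens
    ]


def quantize(prompt, codebook):
    """Map each prompt vector to its nearest codebook vector.

    Nearest-neighbour lookup by squared Euclidean distance; an
    assumption that replaces the paper's sampling-based quantizer.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    indices = [
        min(range(len(codebook)), key=lambda k: sq_dist(vec, codebook[k]))
        for vec in prompt
    ]
    return [list(codebook[k]) for k in indices], indices


if __name__ == "__main__":
    soft_tokens = [[0.0, 0.0], [4.0, 4.0]]        # static, task-level prompt
    input_embeddings = [[4.0, 0.0], [0.0, 4.0]]   # embeddings of one input
    codebook = [[0.0, 0.0], [1.2, 1.2], [3.1, 2.9]]

    contextual = contextualize(soft_tokens, input_embeddings)
    quantized, indices = quantize(contextual, codebook)
    print(contextual)  # input-dependent prompt vectors
    print(indices)     # codebook entries chosen for each soft token
```

Because every input-dependent prompt vector is snapped to one of a small number of codebook entries, different inputs can only produce prompts from a compact, discrete set, which is the intuition behind the variance reduction described above.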
