Prompting ELECTRA: Few-Shot Learning with Discriminative Pre-Trained Models - Zhuanzhi (专知) Papers


Masked language modeling · Few-shot learning · Learning · Language modeling · Discriminator

May 30, 2022

Prompting ELECTRA: Few-Shot Learning with Discriminative Pre-Trained Models


Mengzhou Xia, Mikel Artetxe, Jingfei Du, Danqi Chen, Ves Stoyanov

Pre-trained masked language models successfully perform few-shot learning by formulating downstream tasks as text infilling. However, as a strong alternative in full-shot settings, discriminative pre-trained models like ELECTRA do not fit into the paradigm. In this work, we adapt prompt-based few-shot learning to ELECTRA and show that it outperforms masked language models in a wide range of tasks. ELECTRA is pre-trained to distinguish if a token is generated or original. We naturally extend that to prompt-based few-shot learning by training to score the originality of the target options without introducing new parameters. Our method can be easily adapted to tasks involving multi-token predictions without extra computation overhead. Analysis shows that ELECTRA learns distributions that align better with downstream tasks.
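The scoring idea in the abstract can be illustrated with a minimal sketch: each target option is filled into a prompt, and a discriminator rates how "original" the option tokens look. Everything named here is an assumption made for illustration and is not taken from the paper: the `Discriminator` interface, the `toy_discriminator` rule that stands in for a trained model, the template words, and the choice to average per-token probabilities for multi-token options. The sketch covers inference only, not the few-shot training step.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# Assumption: a discriminator maps a token sequence to one logit per token,
# where a higher logit means "more likely replaced" (ELECTRA-style head).
Discriminator = Callable[[Sequence[str]], List[float]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def originality(tokens: Sequence[str], start: int, end: int,
                discriminator: Discriminator) -> float:
    """Mean probability that tokens[start:end] are original (not replaced)."""
    logits = discriminator(tokens)
    probs = [1.0 - sigmoid(logits[i]) for i in range(start, end)]
    return sum(probs) / len(probs)


def predict(text: Sequence[str], template: Sequence[str],
            options: Dict[str, Sequence[str]],
            discriminator: Discriminator) -> Tuple[str, Dict[str, float]]:
    """Fill each option into the prompt and return the most 'original' label."""
    start = len(text) + len(template)
    scores: Dict[str, float] = {}
    for label, option in options.items():
        tokens = list(text) + list(template) + list(option)
        scores[label] = originality(tokens, start, start + len(option),
                                    discriminator)
    return max(scores, key=scores.get), scores


def toy_discriminator(tokens: Sequence[str]) -> List[float]:
    """Hand-written stand-in for a trained model, for illustration only."""
    positive_context = "loved" in tokens
    logits = []
    for tok in tokens:
        # Flag a sentiment word as "replaced" when it clashes with the input.
        clashes = (tok == "terrible" and positive_context) or \
                  (tok == "great" and not positive_context)
        logits.append(2.0 if clashes else -2.0)
    return logits


label, scores = predict(
    text=["i", "loved", "this", "movie", "."],
    template=["it", "was"],
    options={"positive": ["great"], "negative": ["terrible"]},
    discriminator=toy_discriminator,
)
print(label, scores)
```

With the toy rule above, the call selects the "positive" label because "great" is scored as more original than "terrible" in that context. No new parameters are involved in the scoring: the only learned component would be the discriminator itself, which in practice would be a pre-trained model rather than the hand-written function used here.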


