通过对数据集和快速收集的元调控,调整用于零拍学习的语言模式 (Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections) - 专知论文

会员服务 ·

0

语言模型化 · Prompt · Performer · MoDELS · 学成 ·

2021 年 8 月 26 日

Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections

翻译：通过对数据集和快速收集的元调控,调整用于零拍学习的语言模式

Ruiqi Zhong,Kristy Lee,Zheng Zhang,Dan Klein

from arxiv, EMNLP 2021, Findings

Large pre-trained language models (LMs) such as GPT-3 have acquired a surprising ability to perform zero-shot learning. For example, to classify sentiment without any training examples, we can "prompt" the LM with the review and the label description "Does the user like this movie?", and ask whether the next word is "yes" or "no". However, the next word prediction training objective is still misaligned with the target zero-shot learning objective. To address this weakness, we propose meta-tuning, which directly optimizes the zero-shot learning objective by fine-tuning pre-trained language models on a collection of datasets. We focus on classification tasks, and construct the meta-dataset by aggregating 43 existing datasets and annotating 441 label descriptions in a question-answering (QA) format. When evaluated on unseen tasks, meta-tuned models outperform a same-sized QA model and the previous SOTA zero-shot learning system based on natural language inference. Additionally, increasing parameter count from 220M to 770M improves AUC-ROC scores by 6.3%, and we forecast that even larger models would perform better. Therefore, measuring zero-shot learning performance on language models out-of-the-box might underestimate their true potential, and community-wide efforts on aggregating datasets and unifying their formats can help build models that answer prompts better.

翻译：GPT-3 等大型预先培训语言模型(LMS) 已经获得了一个令人惊讶的零点学习能力。例如, 将情绪分类而不做任何培训范例, 我们可以通过“ 快速” LM 进行“ 快速” LM 的审查和标签描述“ 用户喜欢这部电影”, 并询问下一个词是“ 是” 还是“ 不 ” 。但是, 下一个字预测培训目标仍然与目标零点学习目标不相符。为了解决这一弱点, 我们提议元调整, 通过对数据集收集的预培训语言模型进行微调, 直接优化零点学习目标。我们专注于分类任务, 通过汇总43个现有数据集和在问答(QA)格式中说明441个标签描述, 来“ 快速” 来构建元数据集。在对不可见的任务进行评估时, 元调整模型的大小比目标的QA模型和以前以自然语言为基础的SOTA零点学习系统要快。此外, 将参数计数从220M 到 770M 改进 AUC-ROC 的评分分 6.3%, 我们专注于分类任务, 通过汇总43 和预测更精确的模型, 甚至可以更好地测量其潜在的精确的模型。

0

相关内容

语言模型化

语言模型化

【因果基础】Causality Basics，36页ppt

专知会员服务

52+阅读 · 2021年8月8日

最新《3D医疗图像处理》综述论文，23页pdf，3D Deep Learning on Medical Images: A Review

最新《3D医疗图像处理》综述论文，23页pdf，3D Deep Learning on Medical Images: A Review

专知会员服务

60+阅读 · 2020年7月14日

最新《自然语言处理迁移学习》综述论文，A Survey on Transfer Learning in Natural Language Processing

最新《自然语言处理迁移学习》综述论文，A Survey on Transfer Learning in Natural Language Processing

专知会员服务

139+阅读 · 2020年7月10日

【深度学习社区检测】Deep Learning for Community Detection: Progress, Challenges and Opportunities

【深度学习社区检测】Deep Learning for Community Detection: Progress, Challenges and Opportunities

专知会员服务

28+阅读 · 2020年6月13日

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知会员服务

97+阅读 · 2020年5月31日

可解释强化学习，Explainable Reinforcement Learning: A Survey

可解释强化学习，Explainable Reinforcement Learning: A Survey

专知会员服务

131+阅读 · 2020年5月14日

深度学习自然语言处理综述论文，Natural Language Processing Advancements By Deep Learning: A Survey

深度学习自然语言处理综述论文，Natural Language Processing Advancements By Deep Learning: A Survey

专知会员服务

80+阅读 · 2020年3月5日

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

专知会员服务

43+阅读 · 2020年1月28日

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

专知会员服务

59+阅读 · 2020年1月25日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【论文笔记】通俗理解少样本文本分类 (Few-Shot Text Classification) (1)

【论文笔记】通俗理解少样本文本分类 (Few-Shot Text Classification) (1)

深度学习自然语言处理

7+阅读 · 2020年4月8日

RoBERTa中文预训练模型：RoBERTa for Chinese

RoBERTa中文预训练模型：RoBERTa for Chinese

PaperWeekly

57+阅读 · 2019年9月16日

RoBERTa for Chinese：大规模中文预训练RoBERTa模型

RoBERTa for Chinese：大规模中文预训练RoBERTa模型

AINLP

30+阅读 · 2019年9月8日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

carla 学习笔记

carla 学习笔记

CreateAMind

9+阅读 · 2018年2月7日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Towards General Deep Leakage in Federated Learning

Arxiv

0+阅读 · 2021年10月18日

Few-Shot Bot: Prompt-Based Learning for Dialogue Systems

Few-Shot Bot: Prompt-Based Learning for Dialogue Systems

Arxiv

1+阅读 · 2021年10月15日

CCQA: A New Web-Scale Question Answering Dataset for Model Pre-Training

Arxiv

0+阅读 · 2021年10月14日

Self-conditioning pre-trained language models

Arxiv

0+阅读 · 2021年9月30日

Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing

Arxiv

30+阅读 · 2021年7月28日

CLINE: Contrastive Learning with Semantic Negative Examples for Natural Language Understanding

Arxiv

3+阅读 · 2021年7月1日

Curriculum Learning: A Survey

Arxiv

24+阅读 · 2021年1月25日

A Comprehensive Survey on Transfer Learning

A Comprehensive Survey on Transfer Learning

Arxiv

121+阅读 · 2019年11月7日

A Meta-Learning Framework for Generalized Zero-Shot Learning

A Meta-Learning Framework for Generalized Zero-Shot Learning

Arxiv

3+阅读 · 2019年9月10日

End to end learning and optimization on graphs

Arxiv

7+阅读 · 2019年5月31日

VIP会员

文章信息

相关主题

语言模型化

相关VIP内容

【因果基础】Causality Basics，36页ppt

专知会员服务

52+阅读 · 2021年8月8日

最新《3D医疗图像处理》综述论文，23页pdf，3D Deep Learning on Medical Images: A Review

最新《3D医疗图像处理》综述论文，23页pdf，3D Deep Learning on Medical Images: A Review

专知会员服务

60+阅读 · 2020年7月14日

最新《自然语言处理迁移学习》综述论文，A Survey on Transfer Learning in Natural Language Processing

最新《自然语言处理迁移学习》综述论文，A Survey on Transfer Learning in Natural Language Processing

专知会员服务

139+阅读 · 2020年7月10日

【深度学习社区检测】Deep Learning for Community Detection: Progress, Challenges and Opportunities

【深度学习社区检测】Deep Learning for Community Detection: Progress, Challenges and Opportunities

专知会员服务

28+阅读 · 2020年6月13日

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知会员服务

97+阅读 · 2020年5月31日

可解释强化学习，Explainable Reinforcement Learning: A Survey

可解释强化学习，Explainable Reinforcement Learning: A Survey

专知会员服务

131+阅读 · 2020年5月14日

深度学习自然语言处理综述论文，Natural Language Processing Advancements By Deep Learning: A Survey

深度学习自然语言处理综述论文，Natural Language Processing Advancements By Deep Learning: A Survey

专知会员服务

80+阅读 · 2020年3月5日

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

专知会员服务

43+阅读 · 2020年1月28日

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

专知会员服务

59+阅读 · 2020年1月25日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

【博士论文】低维与高维空间中潜在表征的分析、建模与变换

《生态建模密码破译：建模与编程实践》美陆军最新报告

大模型解决方案白皮书：社交陪伴场景全流程落地指南

面向具身操作的视觉-语言-动作模型综述

相关资讯

【论文笔记】通俗理解少样本文本分类 (Few-Shot Text Classification) (1)

【论文笔记】通俗理解少样本文本分类 (Few-Shot Text Classification) (1)

深度学习自然语言处理

7+阅读 · 2020年4月8日

RoBERTa中文预训练模型：RoBERTa for Chinese

RoBERTa中文预训练模型：RoBERTa for Chinese

PaperWeekly

57+阅读 · 2019年9月16日

RoBERTa for Chinese：大规模中文预训练RoBERTa模型

RoBERTa for Chinese：大规模中文预训练RoBERTa模型

AINLP

30+阅读 · 2019年9月8日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

carla 学习笔记

carla 学习笔记

CreateAMind

9+阅读 · 2018年2月7日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

相关论文

Towards General Deep Leakage in Federated Learning

Arxiv

0+阅读 · 2021年10月18日

Few-Shot Bot: Prompt-Based Learning for Dialogue Systems

Few-Shot Bot: Prompt-Based Learning for Dialogue Systems

Arxiv

1+阅读 · 2021年10月15日

CCQA: A New Web-Scale Question Answering Dataset for Model Pre-Training

Arxiv

0+阅读 · 2021年10月14日

Self-conditioning pre-trained language models

Arxiv

0+阅读 · 2021年9月30日

Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing

Arxiv

30+阅读 · 2021年7月28日

CLINE: Contrastive Learning with Semantic Negative Examples for Natural Language Understanding

Arxiv

3+阅读 · 2021年7月1日

Curriculum Learning: A Survey

Arxiv

24+阅读 · 2021年1月25日

A Comprehensive Survey on Transfer Learning

A Comprehensive Survey on Transfer Learning

Arxiv

121+阅读 · 2019年11月7日

A Meta-Learning Framework for Generalized Zero-Shot Learning

A Meta-Learning Framework for Generalized Zero-Shot Learning

Arxiv

3+阅读 · 2019年9月10日

End to end learning and optimization on graphs

Arxiv

7+阅读 · 2019年5月31日

微信扫码咨询专知VIP会员