Pre-trained contextual representations have led to dramatic performance improvements on a range of downstream tasks. This has motivated researchers to quantify and understand the linguistic information encoded in them. In general, this is done by probing, which consists of training a supervised model to predict a linguistic property from said representations. Unfortunately, this definition of probing has been subject to extensive criticism, and can lead to paradoxical or counter-intuitive results. In this work, we present a novel framework for probing where the goal is to evaluate the inductive bias of representations for a particular task, and provide a practical avenue to do this using Bayesian inference. We apply our framework to a series of token-, arc-, and sentence-level tasks. Our results suggest that our framework solves problems of previous approaches and that fastText can offer a better inductive bias than BERT in certain situations.