Representation of linguistic phenomena in computational language models is typically assessed against the predictions of existing linguistic theories of these phenomena. Using the notion of polarity as a case study, we show that this is not always the most adequate set-up. We probe polarity via so-called 'negative polarity items' (in particular, English 'any') in two pre-trained Transformer-based models (BERT and GPT-2). We show that -- at least for polarity -- metrics derived from language models are more consistent with data from psycholinguistic experiments than with linguistic theory predictions. Establishing this allows us to evaluate the performance of language models more adequately and to use language models to discover new insights into natural language grammar beyond existing linguistic theories. Overall, our results encourage a closer tie between experiments with human subjects and experiments with language models. We propose methods to enable this closer tie, with language models as part of the experimental pipeline, and show this pipeline at work.
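As a minimal sketch of the kind of LM-derived metric referred to above (not the exact pipeline used in the paper), one can compare the surprisal GPT-2 assigns to 'any' in a licensing (negative) context versus a non-licensing (affirmative) one. The example sentences and the use of the Hugging Face transformers library are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch, assuming GPT-2 via the Hugging Face transformers library.
# Compares the surprisal of a continuation containing "any" after a licensing
# (negative) context vs. a non-licensing (affirmative) context.
import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def continuation_surprisal(prefix: str, continuation: str) -> float:
    """Total surprisal (in bits) of `continuation` given `prefix` under GPT-2."""
    prefix_ids = tokenizer(prefix, return_tensors="pt").input_ids
    cont_ids = tokenizer(continuation, return_tensors="pt").input_ids
    input_ids = torch.cat([prefix_ids, cont_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    log_probs = torch.log_softmax(logits, dim=-1)  # natural-log probabilities
    total = 0.0
    for i in range(cont_ids.size(1)):
        # Logits at position t predict the token at position t + 1.
        pos = prefix_ids.size(1) + i - 1
        token_id = input_ids[0, pos + 1]
        total += -log_probs[0, pos, token_id].item() / math.log(2)
    return total

# Hypothetical licensing vs. non-licensing contexts for the NPI "any".
licensed = continuation_surprisal("Nobody bought", " any books.")
unlicensed = continuation_surprisal("Somebody bought", " any books.")
print(f"after 'Nobody bought':   {licensed:.2f} bits")
print(f"after 'Somebody bought': {unlicensed:.2f} bits")
```

A lower surprisal in the negative context than in the affirmative one would be the model-internal analogue of the acceptability contrasts probed in psycholinguistic experiments; for BERT, a masked-token probability for 'any' could play the same role.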