对神经语言模型中协同协议机制的因果分析 (Causal Analysis of Syntactic Agreement Mechanisms in Neural Language Models) - 专知论文

会员服务 ·

0

语言模型化 · MoDELS · 神经语言模型 · 相似度 · Performer ·

2021 年 6 月 15 日

Causal Analysis of Syntactic Agreement Mechanisms in Neural Language Models

翻译：对神经语言模型中协同协议机制的因果分析

Matthew Finlayson,Aaron Mueller,Stuart Shieber,Sebastian Gehrmann,Tal Linzen,Yonatan Belinkov

from arxiv, Accepted to ACL-IJCNLP 2021. Code can be found at https://github.com/mattf1n/lm-intervention

Targeted syntactic evaluations have demonstrated the ability of language models to perform subject-verb agreement given difficult contexts. To elucidate the mechanisms by which the models accomplish this behavior, this study applies causal mediation analysis to pre-trained neural language models. We investigate the magnitude of models' preferences for grammatical inflections, as well as whether neurons process subject-verb agreement similarly across sentences with different syntactic structures. We uncover similarities and differences across architectures and model sizes -- notably, that larger models do not necessarily learn stronger preferences. We also observe two distinct mechanisms for producing subject-verb agreement depending on the syntactic structure of the input sentence. Finally, we find that language models rely on similar sets of neurons when given sentences with similar syntactic structure.

翻译：有针对性的综合评估表明语言模型在困难的情况下有能力执行主题动词协议。为了阐明模型完成这一行为的机制,本研究将因果调解分析应用于培训前神经语言模型。我们调查模型对语法反射的偏好程度,以及神经元处理主题动词协议在与不同合成结构的句子之间是否类似。我们发现不同结构和模型大小的相似和不同之处 -- -- 特别是较大的模型不一定学会更强烈的偏好。我们还观察到根据输入句的合成结构制作主题动词协议的两个不同机制。最后,我们发现语言模型在判刑时依赖类似的神经组。

0

相关内容

语言模型化

语言模型化

【EMNLP2020】自然语言生成，Neural Language Generation

【EMNLP2020】自然语言生成，Neural Language Generation

专知会员服务

39+阅读 · 2020年11月20日

【ST2020硬核课】深度学习即统计学习，50页ppt

【ST2020硬核课】深度学习即统计学习，50页ppt

专知会员服务

67+阅读 · 2020年8月17日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

神经网络序列数据建模，229页ppt，Modeling Sequential Data with Neural Nets

神经网络序列数据建模，229页ppt，Modeling Sequential Data with Neural Nets

专知会员服务

67+阅读 · 2020年7月25日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【论文翻译】NLP注意力机制综述论文翻译，Attention, please! A Critical Review of Neural Attention Models in Natural Language Processing

【论文翻译】NLP注意力机制综述论文翻译，Attention, please! A Critical Review of Neural Attention Models in Natural Language Processing

专知会员服务

96+阅读 · 2020年4月18日

【ICLR2020-MIT】元学习的好奇心算法，Meta-learning curiosity algorithms

【ICLR2020-MIT】元学习的好奇心算法，Meta-learning curiosity algorithms

专知会员服务

34+阅读 · 2020年3月13日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

ICLR2019最佳论文出炉

ICLR2019最佳论文出炉

专知

12+阅读 · 2019年5月6日

一文读懂依存句法分析

一文读懂依存句法分析

AINLP

16+阅读 · 2019年4月28日

动物脑的好奇心和强化学习的好奇心

动物脑的好奇心和强化学习的好奇心

CreateAMind

10+阅读 · 2019年1月26日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word Alignment

Arxiv

3+阅读 · 2021年6月11日

Weight Poisoning Attacks on Pre-trained Models

Weight Poisoning Attacks on Pre-trained Models

Arxiv

5+阅读 · 2020年4月14日

Unsupervised Domain Clusters in Pretrained Language Models

Arxiv

11+阅读 · 2020年4月5日

Learning by Abstraction: The Neural State Machine

Learning by Abstraction: The Neural State Machine

Arxiv

6+阅读 · 2019年7月11日

What Does BERT Look At? An Analysis of BERT's Attention

Arxiv

4+阅读 · 2019年6月11日

How do you correct run-on sentences it's not as easy as it seems

Arxiv

4+阅读 · 2018年9月21日

Neural Network Models for Paraphrase Identification, Semantic Textual Similarity, Natural Language Inference, and Question Answering

Arxiv

7+阅读 · 2018年6月12日

A Stochastic Decoder for Neural Machine Translation

Arxiv

5+阅读 · 2018年5月28日

Inference in Probabilistic Graphical Models by Graph Neural Networks

Arxiv

3+阅读 · 2018年5月25日

Evaluating Layers of Representation in Neural Machine Translation on Part-of-Speech and Semantic Tagging Tasks

Arxiv

3+阅读 · 2018年1月23日

VIP会员

文章信息

相关主题

语言模型化

神经语言模型

相关VIP内容

【EMNLP2020】自然语言生成，Neural Language Generation

【EMNLP2020】自然语言生成，Neural Language Generation

专知会员服务

39+阅读 · 2020年11月20日

【ST2020硬核课】深度学习即统计学习，50页ppt

【ST2020硬核课】深度学习即统计学习，50页ppt

专知会员服务

67+阅读 · 2020年8月17日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

神经网络序列数据建模，229页ppt，Modeling Sequential Data with Neural Nets

神经网络序列数据建模，229页ppt，Modeling Sequential Data with Neural Nets

专知会员服务

67+阅读 · 2020年7月25日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【论文翻译】NLP注意力机制综述论文翻译，Attention, please! A Critical Review of Neural Attention Models in Natural Language Processing

【论文翻译】NLP注意力机制综述论文翻译，Attention, please! A Critical Review of Neural Attention Models in Natural Language Processing

专知会员服务

96+阅读 · 2020年4月18日

【ICLR2020-MIT】元学习的好奇心算法，Meta-learning curiosity algorithms

【ICLR2020-MIT】元学习的好奇心算法，Meta-learning curiosity algorithms

专知会员服务

34+阅读 · 2020年3月13日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

热门VIP内容

开通专知VIP会员享更多权益服务

【博士论文】低维与高维空间中潜在表征的分析、建模与变换

《生态建模密码破译：建模与编程实践》美陆军最新报告

大模型解决方案白皮书：社交陪伴场景全流程落地指南

面向具身操作的视觉-语言-动作模型综述

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

ICLR2019最佳论文出炉

ICLR2019最佳论文出炉

专知

12+阅读 · 2019年5月6日

一文读懂依存句法分析

一文读懂依存句法分析

AINLP

16+阅读 · 2019年4月28日

动物脑的好奇心和强化学习的好奇心

动物脑的好奇心和强化学习的好奇心

CreateAMind

10+阅读 · 2019年1月26日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

相关论文

Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word Alignment

Arxiv

3+阅读 · 2021年6月11日

Weight Poisoning Attacks on Pre-trained Models

Weight Poisoning Attacks on Pre-trained Models

Arxiv

5+阅读 · 2020年4月14日

Unsupervised Domain Clusters in Pretrained Language Models

Arxiv

11+阅读 · 2020年4月5日

Learning by Abstraction: The Neural State Machine

Learning by Abstraction: The Neural State Machine

Arxiv

6+阅读 · 2019年7月11日

What Does BERT Look At? An Analysis of BERT's Attention

Arxiv

4+阅读 · 2019年6月11日

How do you correct run-on sentences it's not as easy as it seems

Arxiv

4+阅读 · 2018年9月21日

Neural Network Models for Paraphrase Identification, Semantic Textual Similarity, Natural Language Inference, and Question Answering

Arxiv

7+阅读 · 2018年6月12日

A Stochastic Decoder for Neural Machine Translation

Arxiv

5+阅读 · 2018年5月28日

Inference in Probabilistic Graphical Models by Graph Neural Networks

Arxiv

3+阅读 · 2018年5月25日

Evaluating Layers of Representation in Neural Machine Translation on Part-of-Speech and Semantic Tagging Tasks

Arxiv

3+阅读 · 2018年1月23日

微信扫码咨询专知VIP会员