The Low-Dimensional Linear Geometry of Contextualized Word Representations - Zhuanzhi Papers


September 14, 2021

The Low-Dimensional Linear Geometry of Contextualized Word Representations


Evan Hernandez, Jacob Andreas

From arXiv; to be published at the 25th Conference on Computational Natural Language Learning (CoNLL).

Black-box probing models can reliably extract linguistic features like tense, number, and syntactic role from pretrained word representations. However, the manner in which these features are encoded in representations remains poorly understood. We present a systematic study of the linear geometry of contextualized word representations in ELMo and BERT. We show that a variety of linguistic features (including structured dependency relationships) are encoded in low-dimensional subspaces. We then refine this geometric picture, showing that there are hierarchical relations between the subspaces encoding general linguistic categories and more specific ones, and that low-dimensional feature encodings are distributed rather than aligned to individual neurons. Finally, we demonstrate that these linear subspaces are causally related to model behavior, and can be used to perform fine-grained manipulation of BERT's output distribution.
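
The two geometric claims in the abstract (a feature lives in a low-dimensional subspace, and removing that subspace removes the feature) can be illustrated on toy data. The following is a minimal sketch under stated assumptions, not the paper's procedure: the "representations" are synthetic Gaussian vectors rather than ELMo or BERT outputs, the subspace is estimated from the SVD of class means rather than from a trained probe, and every function name is hypothetical.

```python
import numpy as np


def make_toy_representations(n=900, dim=32, radius=6.0, seed=0):
    """Synthetic stand-in for word representations (assumption: not real model output).

    A 3-way label is encoded in a random 2-dimensional subspace of a
    dim-dimensional space; every coordinate also carries unit Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    true_basis, _ = np.linalg.qr(rng.normal(size=(dim, 2)))  # orthonormal columns
    angles = 2 * np.pi * np.arange(3) / 3
    centers = radius * np.stack([np.cos(angles), np.sin(angles)], axis=1)
    labels = rng.integers(0, 3, size=n)
    reps = centers[labels] @ true_basis.T + rng.normal(size=(n, dim))
    return reps, labels


def class_means(reps, labels):
    """One mean vector per class, ordered by class id."""
    return np.stack([reps[labels == c].mean(axis=0) for c in np.unique(labels)])


def feature_subspace(reps, labels, rank):
    """Orthonormal basis (dim, rank) for the directions along which class means vary."""
    centered_means = class_means(reps, labels) - reps.mean(axis=0)
    _, _, vt = np.linalg.svd(centered_means, full_matrices=False)
    return vt[:rank].T


def nearest_centroid_accuracy(train, train_y, test, test_y):
    """Fit class centroids on the training split and score the test split."""
    centroids = class_means(train, train_y)
    dists = ((test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return float((dists.argmin(axis=1) == test_y).mean())


reps, labels = make_toy_representations()
train, test = slice(0, 600), slice(600, None)
mu = reps[train].mean(axis=0)
basis = feature_subspace(reps[train], labels[train], rank=2)


def keep(x):
    """Coordinates of x inside the estimated 2-dimensional feature subspace."""
    return (x - mu) @ basis


def ablate(x):
    """x with its component inside the estimated subspace projected out."""
    xc = x - mu
    return xc - xc @ basis @ basis.T


acc_full = nearest_centroid_accuracy(
    reps[train], labels[train], reps[test], labels[test])
acc_kept = nearest_centroid_accuracy(
    keep(reps[train]), labels[train], keep(reps[test]), labels[test])
# After ablation the training centroids coincide up to rounding error,
# so the classifier's predictions carry essentially no label information.
acc_ablated = nearest_centroid_accuracy(
    ablate(reps[train]), labels[train], ablate(reps[test]), labels[test])

print(f"full: {acc_full:.2f}  subspace only: {acc_kept:.2f}  ablated: {acc_ablated:.2f}")
```

In this toy setting, two of the 32 dimensions are enough to recover the label, and projecting them out pushes the classifier toward chance. The basis vectors are dense mixtures of the original coordinates, which is the sense in which such an encoding is distributed rather than tied to individual neurons. Nothing here says how large the corresponding subspaces are in ELMo or BERT; those figures come only from the paper's experiments.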

