When trained on language data, do transformers learn some arbitrary computation that utilizes the full capacity of the architecture or do they learn a simpler, tree-like computation, hypothesized to underlie compositional meaning systems like human languages? There is an apparent tension between compositional accounts of human language understanding, which are based on a restricted bottom-up computational process, and the enormous success of neural models like transformers, which can route information arbitrarily between different parts of their input. One possibility is that these models, while extremely flexible in principle, in practice learn to interpret language hierarchically, ultimately building sentence representations close to those predictable by a bottom-up, tree-structured model. To evaluate this possibility, we describe an unsupervised and parameter-free method to \emph{functionally project} the behavior of any transformer into the space of tree-structured networks. Given an input sentence, we produce a binary tree that approximates the transformer's representation-building process and a score that captures how "tree-like" the transformer's behavior is on the input. While calculation of this score does not require training any additional models, it provably upper-bounds the fit between a transformer and any tree-structured approximation. Using this method, we show that transformers for three different tasks become more tree-like over the course of training, in some cases unsupervisedly recovering the same trees as supervised parsers. These trees, in turn, are predictive of model behavior, with more tree-like models generalizing better on tests of compositional generalization.
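To make the idea of a functional tree projection concrete, the sketch below is a hypothetical illustration, not the paper's implementation: for every span we measure the distance between a representation built with attention restricted to that span and the transformer's full-context representation of the same span, then run a CKY-style dynamic program to find the binary tree minimizing the cumulative distance. The callables `context_repr` and `span_repr` are assumed stand-ins for the transformer's unmasked and span-masked encodings, and the Euclidean distance is a simplification.

```python
import numpy as np

def tree_projection(words, context_repr, span_repr):
    """Find the binary tree whose span-restricted representations best
    match the full-context representations (hypothetical interface)."""
    n = len(words)
    # d[i, j]: distance between span-masked and full-context
    # representations of words[i..j] (inclusive).
    d = {}
    for i in range(n):
        for j in range(i, n):
            d[i, j] = float(np.linalg.norm(span_repr(words, i, j)
                                           - context_repr(words, i, j)))

    best, split = {}, {}
    for i in range(n):
        best[i, i] = d[i, i]
    # CKY-style dynamic program over span lengths.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            k_best, s_best = None, float("inf")
            for k in range(i, j):
                s = best[i, k] + best[k + 1, j]
                if s < s_best:
                    k_best, s_best = k, s
            best[i, j] = d[i, j] + s_best
            split[i, j] = k_best

    def build(i, j):
        if i == j:
            return words[i]
        k = split[i, j]
        return (build(i, k), build(k + 1, j))

    return build(0, n - 1), best[0, n - 1]


# Toy usage: random vectors stand in for transformer hidden states.
rng = np.random.default_rng(0)
vecs = {}

def fake_full(words, i, j):
    return vecs.setdefault(("full", i, j), rng.normal(size=8))

def fake_span(words, i, j):
    return vecs.setdefault(("span", i, j), rng.normal(size=8))

tree, score = tree_projection("the cat sat on the mat".split(), fake_full, fake_span)
print(tree, round(score, 2))
```

In this toy version a lower cumulative distance corresponds to behavior that is better approximated by a bottom-up, tree-structured computation, which is the intuition behind the tree-structuredness score described above; the paper's actual projection and score differ in the choice of representations and distance.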