Human language understanding operates at multiple levels of granularity (e.g., words, phrases, and sentences) with increasing levels of abstraction that can be hierarchically combined. However, existing deep models with stacked layers do not explicitly model any sort of hierarchical process. This paper proposes a recursive Transformer model based on differentiable CKY-style binary trees to emulate the composition process. We extend the bidirectional language model pre-training objective to this architecture, attempting to predict each word given its left and right abstraction nodes. To scale up our approach, we also introduce an efficient pruned tree induction algorithm to enable encoding in just a linear number of composition steps. Experimental results on language modeling and unsupervised parsing show the effectiveness of our approach.
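To make the core idea concrete, the following is a minimal sketch (not the authors' implementation) of differentiable CKY-style binary composition: a chart is filled bottom-up, every span's representation is built from a soft, softmax-weighted mixture over its possible split points rather than CKY's hard argmax, so gradients flow through the induced tree. The module names (`CKYComposer`, `compose`, `score`) and the single-sentence interface are illustrative assumptions.

```python
# A hedged sketch of differentiable CKY-style binary composition.
# Assumption: a learned composition function over (left, right) span
# vectors and a scalar scorer for weighting split points.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CKYComposer(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        # Composition function: maps a (left, right) pair to a parent vector.
        self.compose = nn.Sequential(nn.Linear(2 * dim, dim), nn.Tanh())
        # Scores how plausible a composed span is; used to weight split points.
        self.score = nn.Linear(dim, 1)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (n, dim) word embeddings. chart[i][j] holds the vector
        # for the span covering tokens i..j (inclusive).
        n, _ = tokens.shape
        chart = [[None] * n for _ in range(n)]
        for i in range(n):
            chart[i][i] = tokens[i]
        # Fill the chart bottom-up over increasing span lengths, as in CKY.
        for length in range(2, n + 1):
            for i in range(0, n - length + 1):
                j = i + length - 1
                # One candidate parent per split point k.
                cands = torch.stack([
                    self.compose(torch.cat([chart[i][k], chart[k + 1][j]]))
                    for k in range(i, j)
                ])  # shape: (length - 1, dim)
                # Soft (differentiable) selection over split points instead
                # of a hard argmax, so the tree structure is learnable.
                w = F.softmax(self.score(cands).squeeze(-1), dim=0)
                chart[i][j] = (w.unsqueeze(-1) * cands).sum(0)
        return chart[0][n - 1]  # root representation of the whole sentence

# Usage: encode a 5-token "sentence" of random embeddings.
enc = CKYComposer(dim=16)
root = enc(torch.randn(5, 16))
print(root.shape)  # torch.Size([16])
```

Note that this naive chart fill performs O(n^3) compositions; the pruned tree induction algorithm mentioned in the abstract exists precisely to cut this to a linear number of composition steps, a mechanism the sketch above deliberately omits.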