不加系统化地普遍化:关于按顺序排列的经常网络的构成技能 (Generalization without systematicity: On the compositional skills of sequence-to-sequence recurrent networks) - 专知论文

会员服务 ·

0

泛化理论 · 循环网络 · SCAN · 可理解性 · Networking ·

2018 年 6 月 6 日

Generalization without systematicity: On the compositional skills of sequence-to-sequence recurrent networks

翻译：不加系统化地普遍化:关于按顺序排列的经常网络的构成技能

Brenden M. Lake,Marco Baroni

from arxiv, Published at the 35th International Conference on Machine Learning (ICML 2018)

Humans can understand and produce new utterances effortlessly, thanks to their compositional skills. Once a person learns the meaning of a new verb "dax," he or she can immediately understand the meaning of "dax twice" or "sing and dax." In this paper, we introduce the SCAN domain, consisting of a set of simple compositional navigation commands paired with the corresponding action sequences. We then test the zero-shot generalization capabilities of a variety of recurrent neural networks (RNNs) trained on SCAN with sequence-to-sequence methods. We find that RNNs can make successful zero-shot generalizations when the differences between training and test commands are small, so that they can apply "mix-and-match" strategies to solve the task. However, when generalization requires systematic compositional skills (as in the "dax" example above), RNNs fail spectacularly. We conclude with a proof-of-concept experiment in neural machine translation, suggesting that lack of systematicity might be partially responsible for neural networks' notorious training data thirst.

翻译：人类可以不费力地理解并产生新的发音, 因为他们的构成技能。一旦一个人学会了一个新的动词“ 达克斯” 的含义, 他或她可以立即理解“ 达克斯两次” 或“ 萨克斯 ” 或“ 达克斯 ” 的含义。在本文中, 我们引入 SCAN 域, 由一套简单的组成导航命令组成, 并配以相应的动作序列。然后我们用从顺序到顺序的方法测试在 SCAN 上培训过的各种经常性神经网络( RNNS ) 的零光谱化能力。我们发现, 当培训和测试命令之间的差异很小时, RNN 就可以成功进行零光化的概括化, 这样他们就可以应用“ 混合和匹配” 战略来解决问题。但是, 当一般化需要系统化的构成技能( 如上面的“ 达克斯 ” 示例), RNNNS 就会大失常。我们最后用神经机翻译的校准实验来结束。我们发现, 缺乏系统性可能是神经网络的臭的训练数据缺乏的部分责任。

3

相关内容

泛化理论

【DeepMind深度学习课程】序列循环神经网络，141页ppt，Sequences and Recurrent Network

【DeepMind深度学习课程】序列循环神经网络，141页ppt，Sequences and Recurrent Network

专知会员服务

86+阅读 · 2020年6月23日

【MIT深度学习课程】深度序列建模，Deep Sequence Modeling

【MIT深度学习课程】深度序列建模，Deep Sequence Modeling

专知会员服务

78+阅读 · 2020年2月3日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【CAAI 2019】XLNet and Beyond，杨植麟，联合创始人，循环智能（Recurrent AI）

【CAAI 2019】XLNet and Beyond，杨植麟，联合创始人，循环智能（Recurrent AI）

专知会员服务

14+阅读 · 2019年12月4日

【EMNLP2019Keynote报告】神经序列模型， Neural Sequence Models，63页ppt

【EMNLP2019Keynote报告】神经序列模型， Neural Sequence Models，63页ppt

专知会员服务

27+阅读 · 2019年11月10日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

VALSE Webinar 特别专题之产学研共舞VALSE

VALSE Webinar 特别专题之产学研共舞VALSE

VALSE

7+阅读 · 2019年9月19日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

LibRec 精选：推荐系统的论文与源码

LibRec 精选：推荐系统的论文与源码

LibRec智能推荐

14+阅读 · 2018年11月29日

利用动态深度学习预测金融时间序列基于Python

利用动态深度学习预测金融时间序列基于Python

量化投资与机器学习

18+阅读 · 2018年10月30日

LibRec 精选：基于LSTM的序列推荐实现（PyTorch）

LibRec 精选：基于LSTM的序列推荐实现（PyTorch）

LibRec智能推荐

50+阅读 · 2018年8月27日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

【推荐】用Python/OpenCV实现增强现实

【推荐】用Python/OpenCV实现增强现实

机器学习研究会

15+阅读 · 2017年11月16日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

Attention Forcing for Sequence-to-sequence Model Training

Attention Forcing for Sequence-to-sequence Model Training

Arxiv

7+阅读 · 2019年9月26日

Cloze-driven Pretraining of Self-attention Networks

Arxiv

6+阅读 · 2019年3月19日

Learning with Interpretable Structure from RNN

Arxiv

19+阅读 · 2018年10月25日

Improving the Transformer Translation Model with Document-Level Context

Arxiv

4+阅读 · 2018年10月8日

Relational recurrent neural networks

Relational recurrent neural networks

Arxiv

8+阅读 · 2018年6月28日

Classical Structured Prediction Losses for Sequence to Sequence Learning

Arxiv

6+阅读 · 2018年5月24日

Token-level and sequence-level loss smoothing for RNN language models

Arxiv

7+阅读 · 2018年5月14日

End-to-End Video Captioning with Multitask Reinforcement Learning

Arxiv

5+阅读 · 2018年3月21日

Variational Recurrent Neural Machine Translation

Arxiv

5+阅读 · 2018年1月16日

Language Modeling with Gated Convolutional Networks

Arxiv

5+阅读 · 2017年9月8日

VIP会员

文章信息

相关主题

相关VIP内容

【DeepMind深度学习课程】序列循环神经网络，141页ppt，Sequences and Recurrent Network

【DeepMind深度学习课程】序列循环神经网络，141页ppt，Sequences and Recurrent Network

专知会员服务

86+阅读 · 2020年6月23日

【MIT深度学习课程】深度序列建模，Deep Sequence Modeling

【MIT深度学习课程】深度序列建模，Deep Sequence Modeling

专知会员服务

78+阅读 · 2020年2月3日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【CAAI 2019】XLNet and Beyond，杨植麟，联合创始人，循环智能（Recurrent AI）

【CAAI 2019】XLNet and Beyond，杨植麟，联合创始人，循环智能（Recurrent AI）

专知会员服务

14+阅读 · 2019年12月4日

【EMNLP2019Keynote报告】神经序列模型， Neural Sequence Models，63页ppt

【EMNLP2019Keynote报告】神经序列模型， Neural Sequence Models，63页ppt

专知会员服务

27+阅读 · 2019年11月10日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

小规模训练指南：打造世界级大语言模型的关键方法

无人机编队飞行：复杂环境中作战的策略、挑战与应用

大模型APP，AI时代第一个爆款

从数据中心视角出发的高效大语言模型训练综述

相关资讯

VALSE Webinar 特别专题之产学研共舞VALSE

VALSE Webinar 特别专题之产学研共舞VALSE

VALSE

7+阅读 · 2019年9月19日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

LibRec 精选：推荐系统的论文与源码

LibRec 精选：推荐系统的论文与源码

LibRec智能推荐

14+阅读 · 2018年11月29日

利用动态深度学习预测金融时间序列基于Python

利用动态深度学习预测金融时间序列基于Python

量化投资与机器学习

18+阅读 · 2018年10月30日

LibRec 精选：基于LSTM的序列推荐实现（PyTorch）

LibRec 精选：基于LSTM的序列推荐实现（PyTorch）

LibRec智能推荐

50+阅读 · 2018年8月27日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

【推荐】用Python/OpenCV实现增强现实

【推荐】用Python/OpenCV实现增强现实

机器学习研究会

15+阅读 · 2017年11月16日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

相关论文

Attention Forcing for Sequence-to-sequence Model Training

Attention Forcing for Sequence-to-sequence Model Training

Arxiv

7+阅读 · 2019年9月26日

Cloze-driven Pretraining of Self-attention Networks

Arxiv

6+阅读 · 2019年3月19日

Learning with Interpretable Structure from RNN

Arxiv

19+阅读 · 2018年10月25日

Improving the Transformer Translation Model with Document-Level Context

Arxiv

4+阅读 · 2018年10月8日

Relational recurrent neural networks

Relational recurrent neural networks

Arxiv

8+阅读 · 2018年6月28日

Classical Structured Prediction Losses for Sequence to Sequence Learning

Arxiv

6+阅读 · 2018年5月24日

Token-level and sequence-level loss smoothing for RNN language models

Arxiv

7+阅读 · 2018年5月14日

End-to-End Video Captioning with Multitask Reinforcement Learning

Arxiv

5+阅读 · 2018年3月21日

Variational Recurrent Neural Machine Translation

Arxiv

5+阅读 · 2018年1月16日

Language Modeling with Gated Convolutional Networks

Arxiv

5+阅读 · 2017年9月8日

微信扫码咨询专知VIP会员