内存旋转单元 (Rotational Unit of Memory) - 专知论文

会员服务 ·

0

state-of-the-art · RNN · Performer · 梯度消失与爆炸问题 · MoDELS ·

2017 年 10 月 26 日

Rotational Unit of Memory

翻译：内存旋转单元

Rumen Dangovski,Li Jing,Marin Soljacic

The concepts of unitary evolution matrices and associative memory have boosted the field of Recurrent Neural Networks (RNN) to state-of-the-art performance in a variety of sequential tasks. However, RNN still have a limited capacity to manipulate long-term memory. To bypass this weakness the most successful applications of RNN use external techniques such as attention mechanisms. In this paper we propose a novel RNN model that unifies the state-of-the-art approaches: Rotational Unit of Memory (RUM). The core of RUM is its rotational operation, which is, naturally, a unitary matrix, providing architectures with the power to learn long-term dependencies by overcoming the vanishing and exploding gradients problem. Moreover, the rotational unit also serves as associative memory. We evaluate our model on synthetic memorization, question answering and language modeling tasks. RUM learns the Copying Memory task completely and improves the state-of-the-art result in the Recall task. RUM's performance in the bAbI Question Answering task is comparable to that of models with attention mechanism. We also improve the state-of-the-art result to 1.189 bits-per-character (BPC) loss in the Character Level Penn Treebank (PTB) task, which is to signify the applications of RUM to real-world sequential data. The universality of our construction, at the core of RNN, establishes RUM as a promising approach to language modeling, speech recognition and machine translation.

翻译：统一进化矩阵和连带记忆的概念将常规神经网络(RNN)的功能提升到常规神经网络(RNN)的领域,使其在各种相继任务中达到最先进的性能。然而,RNN仍然有有限的能力来操纵长期记忆。为绕过这一弱点,RNN的最成功应用使用外部技术,如关注机制等。在本文件中,我们提出了一个新的RNNN模式,统一了最先进的方法:记忆旋转单元(RUM)。RUM的核心是其旋转操作,它自然是一个统一的矩阵,通过克服逐渐消失和爆炸的梯度问题,为学习长期依赖性学习的架构提供力量。此外,轮用单位也起到连带记忆的作用。我们评价了我们的合成记忆、问题回答和语言模型模型模型,彻底地学习了记忆任务,改进了RMUM(RMMU)的状态和状态。RUMU的功能在B解答任务中的表现与RPR-189号模型的实际模型的模型的翻版,我们把RPRPRB的正级数据转换到RPL的模型的版本。

0

相关内容

state-of-the-art

state-of-the-art

【ICML2020-华为港科大】RNN和LSTM有长期记忆吗？

【ICML2020-华为港科大】RNN和LSTM有长期记忆吗？

专知会员服务

78+阅读 · 2020年6月25日

知识图谱推理，50页ppt，Salesforce首席科学家Richard Socher

知识图谱推理，50页ppt，Salesforce首席科学家Richard Socher

专知会员服务

111+阅读 · 2020年6月10日

神经网络的拓扑结构，TOPOLOGY OF DEEP NEURAL NETWORKS

神经网络的拓扑结构，TOPOLOGY OF DEEP NEURAL NETWORKS

专知会员服务

35+阅读 · 2020年4月15日

【ICLR2020-】基于记忆的图网络，MEMORY-BASED GRAPH NETWORKS

【ICLR2020-】基于记忆的图网络，MEMORY-BASED GRAPH NETWORKS

专知会员服务

110+阅读 · 2020年2月22日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【图深度学习GDL论文大全】A comprehensive collection of recent papers on graph deep learning

【图深度学习GDL论文大全】A comprehensive collection of recent papers on graph deep learning

专知会员服务

47+阅读 · 2019年12月1日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

RL 真经

CreateAMind

5+阅读 · 2018年12月28日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新六篇自动问答（QA）相关论文—复杂序列问答、注意力机制、长短时记忆、文本推理、多因素注意力、主动的问答智能体

【论文推荐】最新六篇自动问答（QA）相关论文—复杂序列问答、注意力机制、长短时记忆、文本推理、多因素注意力、主动的问答智能体

专知

18+阅读 · 2018年2月22日

Simple Recurrent Unit For Sentence Classification

Simple Recurrent Unit For Sentence Classification

哈工大SCIR

6+阅读 · 2017年11月29日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

Highway Networks For Sentence Classification

Highway Networks For Sentence Classification

哈工大SCIR

4+阅读 · 2017年9月30日

Do RNN and LSTM have Long Memory?

Do RNN and LSTM have Long Memory?

Arxiv

19+阅读 · 2020年6月10日

Learning by Abstraction: The Neural State Machine

Learning by Abstraction: The Neural State Machine

Arxiv

6+阅读 · 2019年7月11日

On Attribution of Recurrent Neural Network Predictions via Additive Decomposition

Arxiv

3+阅读 · 2019年3月27日

Deep Short Text Classification with Knowledge Powered Attention

Arxiv

8+阅读 · 2019年2月21日

Analysis Methods in Neural Language Processing: A Survey

Analysis Methods in Neural Language Processing: A Survey

Arxiv

4+阅读 · 2019年1月14日

Pay More Attention - Neural Architectures for Question-Answering

Arxiv

5+阅读 · 2018年3月25日

Learning Dynamic Memory Networks for Object Tracking

Arxiv

9+阅读 · 2018年3月20日

A Read-Write Memory Network for Movie Story Understanding

Arxiv

5+阅读 · 2018年3月16日

Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning

Arxiv

6+阅读 · 2018年3月14日

Learning Intrinsic Sparse Structures within Long Short-Term Memory

Arxiv

4+阅读 · 2018年1月30日

VIP会员

文章信息

相关主题

state-of-the-art

梯度消失与爆炸问题

相关VIP内容

【ICML2020-华为港科大】RNN和LSTM有长期记忆吗？

【ICML2020-华为港科大】RNN和LSTM有长期记忆吗？

专知会员服务

78+阅读 · 2020年6月25日

知识图谱推理，50页ppt，Salesforce首席科学家Richard Socher

知识图谱推理，50页ppt，Salesforce首席科学家Richard Socher

专知会员服务

111+阅读 · 2020年6月10日

神经网络的拓扑结构，TOPOLOGY OF DEEP NEURAL NETWORKS

神经网络的拓扑结构，TOPOLOGY OF DEEP NEURAL NETWORKS

专知会员服务

35+阅读 · 2020年4月15日

【ICLR2020-】基于记忆的图网络，MEMORY-BASED GRAPH NETWORKS

【ICLR2020-】基于记忆的图网络，MEMORY-BASED GRAPH NETWORKS

专知会员服务

110+阅读 · 2020年2月22日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【图深度学习GDL论文大全】A comprehensive collection of recent papers on graph deep learning

【图深度学习GDL论文大全】A comprehensive collection of recent papers on graph deep learning

专知会员服务

47+阅读 · 2019年12月1日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

新书册《几何深度学习的数学基础》

中程单向攻击无人机的战略意义：俄乌战争启示

在无标注条件下适配视觉—语言模型：全面综述

面向视觉语言模型的持续学习：遗忘之外的综述与分类体系

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

RL 真经

CreateAMind

5+阅读 · 2018年12月28日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新六篇自动问答（QA）相关论文—复杂序列问答、注意力机制、长短时记忆、文本推理、多因素注意力、主动的问答智能体

【论文推荐】最新六篇自动问答（QA）相关论文—复杂序列问答、注意力机制、长短时记忆、文本推理、多因素注意力、主动的问答智能体

专知

18+阅读 · 2018年2月22日

Simple Recurrent Unit For Sentence Classification

Simple Recurrent Unit For Sentence Classification

哈工大SCIR

6+阅读 · 2017年11月29日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

Highway Networks For Sentence Classification

Highway Networks For Sentence Classification

哈工大SCIR

4+阅读 · 2017年9月30日

相关论文

Do RNN and LSTM have Long Memory?

Do RNN and LSTM have Long Memory?

Arxiv

19+阅读 · 2020年6月10日

Learning by Abstraction: The Neural State Machine

Learning by Abstraction: The Neural State Machine

Arxiv

6+阅读 · 2019年7月11日

On Attribution of Recurrent Neural Network Predictions via Additive Decomposition

Arxiv

3+阅读 · 2019年3月27日

Deep Short Text Classification with Knowledge Powered Attention

Arxiv

8+阅读 · 2019年2月21日

Analysis Methods in Neural Language Processing: A Survey

Analysis Methods in Neural Language Processing: A Survey

Arxiv

4+阅读 · 2019年1月14日

Pay More Attention - Neural Architectures for Question-Answering

Arxiv

5+阅读 · 2018年3月25日

Learning Dynamic Memory Networks for Object Tracking

Arxiv

9+阅读 · 2018年3月20日

A Read-Write Memory Network for Movie Story Understanding

Arxiv

5+阅读 · 2018年3月16日

Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning

Arxiv

6+阅读 · 2018年3月14日

Learning Intrinsic Sparse Structures within Long Short-Term Memory

Arxiv

4+阅读 · 2018年1月30日

微信扫码咨询专知VIP会员