Recently, there has been interest in multiplicative recurrent neural networks for language modeling. Indeed, simple Recurrent Neural Networks (RNNs) have difficulty recovering from past mistakes when generating sequences, owing to the high correlation between successive hidden states. These difficulties can be mitigated by integrating second-order (multiplicative) terms into the hidden-state update. One such model, multiplicative Long Short-Term Memory (mLSTM), is particularly interesting in its original formulation because of the sharing of its second-order term, referred to as the intermediate state. We explore these architectural improvements by introducing new models and testing them on character-level language modeling tasks. This allows us to establish the relevance of shared parametrization in recurrent language modeling.
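For concreteness, a minimal sketch of the shared second-order term, assuming the standard mLSTM formulation; the weight-matrix names are chosen for illustration and biases are omitted:

% Sketch of the mLSTM intermediate state (notation illustrative, biases omitted).
\begin{align*}
  m_t &= (W_{mx} x_t) \odot (W_{mh} h_{t-1}) && \text{intermediate state (second-order term)} \\
  \hat{h}_t &= W_{hx} x_t + W_{hm} m_t && \text{candidate update} \\
  i_t &= \sigma(W_{ix} x_t + W_{im} m_t), \qquad
  o_t = \sigma(W_{ox} x_t + W_{om} m_t), \qquad
  f_t = \sigma(W_{fx} x_t + W_{fm} m_t)
\end{align*}

The same intermediate state $m_t$ replaces $h_{t-1}$ in the candidate update and in every gate, which is the sharing of the second-order term referred to above.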