Although neural language models are effective at capturing statistics of natural language, their representations are challenging to interpret. In particular, it is unclear how these models retain information over multiple timescales. In this work, we construct explicitly multi-timescale language models by manipulating the input and forget gate biases in a long short-term memory (LSTM) network. The distribution of timescales is selected to approximate the power-law statistics of natural language through a combination of exponentially decaying memory cells. We then empirically analyze the timescale of information routed through each part of the model using word ablation experiments and forget gate visualizations. These experiments show that the multi-timescale model successfully learns representations at the desired timescales, and that the distribution includes longer timescales than a standard LSTM. Further, information about high-, mid-, and low-frequency words is routed preferentially through units with the appropriate timescales. Thus we show how to construct language models with interpretable representations of different information timescales.
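To make the gate-bias manipulation concrete, the following is a minimal PyTorch sketch of one way to fix LSTM forget and input gate biases so that each unit decays with a pre-assigned timescale. The library choice (PyTorch), the inverse-gamma-style sampling of timescales, the specific values of hidden_size, alpha, and beta, and the decision to freeze the biases are all illustrative assumptions, not details taken from the abstract.

```python
import torch
import torch.nn as nn

# Hypothetical setup: a 1-layer LSTM over 400-d word embeddings.
hidden_size = 512
lstm = nn.LSTM(input_size=400, hidden_size=hidden_size,
               num_layers=1, batch_first=True)

with torch.no_grad():
    # Assumed timescale distribution: T = 1/Gamma(alpha, beta), i.e. an
    # inverse-gamma-style heavy tail; alpha and beta are illustrative only.
    alpha, beta = 0.56, 1.0
    T = 1.0 / torch.distributions.Gamma(alpha, beta).sample((hidden_size,)) + 1.0

    # A cell with a constant forget gate f decays as f^t, so its timescale is
    # T = -1/log(f). Choosing f = exp(-1/T) and solving sigmoid(b_f) = f gives
    # b_f = -log(exp(1/T) - 1), computed with expm1 for numerical stability.
    b_f = -torch.log(torch.expm1(1.0 / T))

    # nn.LSTM packs gate biases in the order [input, forget, cell, output],
    # and the effective bias is bias_ih + bias_hh, so split b_f across both.
    i_slice = slice(0, hidden_size)
    f_slice = slice(hidden_size, 2 * hidden_size)
    for name in ("bias_ih_l0", "bias_hh_l0"):
        bias = getattr(lstm, name)
        bias[f_slice] = 0.5 * b_f   # forget-gate bias sets the target timescale
        bias[i_slice] = -0.5 * b_f  # complementary input-gate bias
        bias.requires_grad_(False)  # freeze biases so the timescales stay fixed
```

In this sketch the input-gate bias is simply tied to the negative of the forget-gate bias so the two gates stay roughly complementary; how the remaining weights are trained is left to the full model.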