双数贝塔LSTM (Bivariate Beta LSTM) - 专知论文

会员服务 ·

0

贝塔分布 · Sigmoid函数 · Sigmoid（一种激活函数） · 长短期记忆网络 · MoDELS ·

2019 年 10 月 7 日

Bivariate Beta LSTM

翻译：双数贝塔LSTM

Kyungwoo Song,JoonHo Jang,Seung jae Shin,Il-Chul Moon

Long Short-Term Memory (LSTM) infers the long term dependency through a cell state maintained by the input and the forget gate structures, which models a gate output as a value in [0,1] through a sigmoid function. However, due to the graduality of the sigmoid function, the sigmoid gate is not flexible in representing multi-modality or skewness. Besides, the previous models lack modeling on the correlation between the gates, which would be a new method to adopt inductive bias for a relationship between previous and current input. This paper proposes a new gate structure with the bivariate Beta distribution. The proposed gate structure enables probabilistic modeling on the gates within the LSTM cell so that the modelers can customize the cell state flow with priors and distributions. Moreover, we theoretically show the higher upper bound of the gradient compared to the sigmoid function, and we empirically observed that the bivariate Beta distribution gate structure provides higher gradient values in training. We demonstrate the effectiveness of bivariate Beta gate structure on the sentence classification, image classification, polyphonic music modeling, and image caption generation.

翻译：长期短期内存( LSTM) 通过输入和遗忘门结构所维护的单元格状态来推断长期依赖性, 以输入和遗忘门结构所维护的单元格状态来模拟门输出值, 以 [0, 1 以数值为 [0], 以示象形函数为模型。但是, 由于Sigmoid 函数的渐进性, sigmoid门在代表多式或扭曲性时不具有灵活性。此外, 以前的模型缺乏对门之间相关性的模型, 这将是对先前和当前输入之间的关系采用感应偏差的新方法。本文建议使用双变量 Beta 分布的新门结构。拟议的门结构使得在 LSTM 单元格的门上能够进行概率性建模, 以便模型能够根据先前和分布来定制细胞的状态流。此外, 我们理论上展示了梯度与象函数功能相比较高的上限, 我们从经验中观察到, 双变量Beta 分发门结构在培训中提供了更高的梯度值。我们展示了在句分类、图像分类、多式音乐模型和图像生成图象上的效能。

6

相关内容

贝塔分布

【斯坦福大学AI】BERT, ELMo， & GPT-2:上下文化的单词表示是怎样的?

专知会员服务

35+阅读 · 2020年3月28日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

最大均方差正则化贝叶斯神经网络，Bayesian Neural Networks With Maximum Mean Discrepancy Regularization

最大均方差正则化贝叶斯神经网络，Bayesian Neural Networks With Maximum Mean Discrepancy Regularization

专知会员服务

54+阅读 · 2020年3月5日

【Google-普林斯顿】从学习速率中解开自适应梯度法，Disentangling Adaptive Gradient

专知会员服务

19+阅读 · 2020年3月5日

【MIT】时间序列GAN，Subadditivity of Probability Divergences

专知会员服务

63+阅读 · 2020年3月4日

【MIT深度学习课程】深度序列建模，Deep Sequence Modeling

【MIT深度学习课程】深度序列建模，Deep Sequence Modeling

专知会员服务

78+阅读 · 2020年2月3日

【反馈循环自编码器】FEEDBACK RECURRENT AUTOENCODER

【反馈循环自编码器】FEEDBACK RECURRENT AUTOENCODER

专知会员服务

23+阅读 · 2020年1月28日

【NeurIPS2019】高性能浅层RNN的类脑目标识别（Brain-Like Object Recognition with High-Performing Shallow Recurrent ANNs）

【NeurIPS2019】高性能浅层RNN的类脑目标识别（Brain-Like Object Recognition with High-Performing Shallow Recurrent ANNs）

专知会员服务

13+阅读 · 2019年12月13日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

vae 相关论文表示学习 2

vae 相关论文表示学习 2

CreateAMind

6+阅读 · 2018年9月9日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

Linguistically Regularized LSTMs for Sentiment Classification

Linguistically Regularized LSTMs for Sentiment Classification

黑龙江大学自然语言处理实验室

8+阅读 · 2018年5月4日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

【推荐】用TensorFlow实现LSTM社交对话股市情感分析

【推荐】用TensorFlow实现LSTM社交对话股市情感分析

机器学习研究会

11+阅读 · 2018年1月14日

【推荐】(Python)多种模型(Naive Bayes, SVM, CNN, LSTM, etc)实现推文情感分析

【推荐】(Python)多种模型(Naive Bayes, SVM, CNN, LSTM, etc)实现推文情感分析

机器学习研究会

13+阅读 · 2017年12月25日

Simple Recurrent Unit For Sentence Classification

Simple Recurrent Unit For Sentence Classification

哈工大SCIR

6+阅读 · 2017年11月29日

Adversarial Variational Bayes: Unifying VAE and GAN 代码

Adversarial Variational Bayes: Unifying VAE and GAN 代码

CreateAMind

7+阅读 · 2017年10月4日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks

Arxiv

36+阅读 · 2020年5月24日

Scalable Gromov-Wasserstein Learning for Graph Partitioning and Matching

Arxiv

8+阅读 · 2019年10月9日

Latent Relation Language Models

Arxiv

21+阅读 · 2019年8月21日

Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks

Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks

Arxiv

4+阅读 · 2018年11月21日

Rethinking Knowledge Graph Propagation for Zero-Shot Learning

Arxiv

17+阅读 · 2018年5月31日

Aspect Based Sentiment Analysis with Gated Convolutional Networks

Arxiv

12+阅读 · 2018年5月18日

Wasserstein Auto-Encoders

Arxiv

7+阅读 · 2018年3月12日

Activation Maximization Generative Adversarial Nets

Arxiv

5+阅读 · 2018年1月30日

Learning Intrinsic Sparse Structures within Long Short-Term Memory

Arxiv

4+阅读 · 2018年1月30日

Depth-Gated LSTM

Arxiv

4+阅读 · 2015年8月25日

VIP会员

文章信息

相关主题

Sigmoid（一种激活函数）

长短期记忆网络

相关VIP内容

【斯坦福大学AI】BERT, ELMo， & GPT-2:上下文化的单词表示是怎样的?

专知会员服务

35+阅读 · 2020年3月28日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

最大均方差正则化贝叶斯神经网络，Bayesian Neural Networks With Maximum Mean Discrepancy Regularization

最大均方差正则化贝叶斯神经网络，Bayesian Neural Networks With Maximum Mean Discrepancy Regularization

专知会员服务

54+阅读 · 2020年3月5日

【Google-普林斯顿】从学习速率中解开自适应梯度法，Disentangling Adaptive Gradient

专知会员服务

19+阅读 · 2020年3月5日

【MIT】时间序列GAN，Subadditivity of Probability Divergences

专知会员服务

63+阅读 · 2020年3月4日

【MIT深度学习课程】深度序列建模，Deep Sequence Modeling

【MIT深度学习课程】深度序列建模，Deep Sequence Modeling

专知会员服务

78+阅读 · 2020年2月3日

【反馈循环自编码器】FEEDBACK RECURRENT AUTOENCODER

【反馈循环自编码器】FEEDBACK RECURRENT AUTOENCODER

专知会员服务

23+阅读 · 2020年1月28日

【NeurIPS2019】高性能浅层RNN的类脑目标识别（Brain-Like Object Recognition with High-Performing Shallow Recurrent ANNs）

【NeurIPS2019】高性能浅层RNN的类脑目标识别（Brain-Like Object Recognition with High-Performing Shallow Recurrent ANNs）

专知会员服务

13+阅读 · 2019年12月13日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

面向具身操作的视觉-语言-动作模型综述

《多域空战指挥体系：驾驭复杂性的艺术》

【博士论文】低维与高维空间中潜在表征的分析、建模与变换

《生态建模密码破译：建模与编程实践》美陆军最新报告

相关资讯

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

vae 相关论文表示学习 2

vae 相关论文表示学习 2

CreateAMind

6+阅读 · 2018年9月9日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

Linguistically Regularized LSTMs for Sentiment Classification

Linguistically Regularized LSTMs for Sentiment Classification

黑龙江大学自然语言处理实验室

8+阅读 · 2018年5月4日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

【推荐】用TensorFlow实现LSTM社交对话股市情感分析

【推荐】用TensorFlow实现LSTM社交对话股市情感分析

机器学习研究会

11+阅读 · 2018年1月14日

【推荐】(Python)多种模型(Naive Bayes, SVM, CNN, LSTM, etc)实现推文情感分析

【推荐】(Python)多种模型(Naive Bayes, SVM, CNN, LSTM, etc)实现推文情感分析

机器学习研究会

13+阅读 · 2017年12月25日

Simple Recurrent Unit For Sentence Classification

Simple Recurrent Unit For Sentence Classification

哈工大SCIR

6+阅读 · 2017年11月29日

Adversarial Variational Bayes: Unifying VAE and GAN 代码

Adversarial Variational Bayes: Unifying VAE and GAN 代码

CreateAMind

7+阅读 · 2017年10月4日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks

Arxiv

36+阅读 · 2020年5月24日

Scalable Gromov-Wasserstein Learning for Graph Partitioning and Matching

Arxiv

8+阅读 · 2019年10月9日

Latent Relation Language Models

Arxiv

21+阅读 · 2019年8月21日

Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks

Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks

Arxiv

4+阅读 · 2018年11月21日

Rethinking Knowledge Graph Propagation for Zero-Shot Learning

Arxiv

17+阅读 · 2018年5月31日

Aspect Based Sentiment Analysis with Gated Convolutional Networks

Arxiv

12+阅读 · 2018年5月18日

Wasserstein Auto-Encoders

Arxiv

7+阅读 · 2018年3月12日

Activation Maximization Generative Adversarial Nets

Arxiv

5+阅读 · 2018年1月30日

Learning Intrinsic Sparse Structures within Long Short-Term Memory

Arxiv

4+阅读 · 2018年1月30日

Depth-Gated LSTM

Arxiv

4+阅读 · 2015年8月25日

微信扫码咨询专知VIP会员