Previous work on Recurrent Neural Network Transducer (RNN-T) models has shown that, under some conditions, it is possible to simplify the prediction network with little or no loss in recognition accuracy (arXiv:2003.07705 [eess.AS], [2], arXiv:2012.06749 [cs.CL]). This is done by limiting the context size of previous labels and/or by using simpler layer architectures than LSTMs. The benefits of such changes include reduced model size, faster inference, and power savings, all of which are useful for on-device applications. In this work, we study ways to make the RNN-T decoder (prediction network + joint network) smaller and faster without degradation in recognition performance. Our prediction network performs a simple weighted averaging of the input embeddings and shares its embedding matrix weights with the joint network's output layer (a.k.a. weight tying, commonly used in language modeling, arXiv:1611.01462 [cs.LG]). This simple design, when used in conjunction with additional Edit-based Minimum Bayes Risk (EMBR) training, reduces the RNN-T decoder from 23M parameters to just 2M, without affecting word-error rate (WER).
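To make the design concrete, below is a minimal sketch (not the authors' code) of a stateless prediction network that averages the embeddings of the last few labels with learnable weights, plus a joint network whose output layer reuses the shared embedding matrix (weight tying). It assumes PyTorch; the class name, dimensions, and the exact averaging scheme are illustrative assumptions, not taken from the paper.

```python
# Sketch of a tied, reduced RNN-T decoder: weighted-average prediction network
# plus weight tying between the embedding matrix and the joint output layer.
# All names and dimensions are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TiedEmbeddingDecoder(nn.Module):
    def __init__(self, vocab_size, embed_dim=640, context_size=2,
                 encoder_dim=512, joint_dim=640):
        super().__init__()
        # Shared embedding matrix: embeds previous labels in the prediction
        # network and is reused as the joint network's output projection.
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # Learnable per-position weights for averaging the last `context_size`
        # label embeddings (no LSTM, no recurrence).
        self.pos_weights = nn.Parameter(torch.zeros(context_size))
        # Joint network: combine encoder and prediction outputs, then project
        # back to embed_dim so the tied embedding matrix can produce logits.
        self.enc_proj = nn.Linear(encoder_dim, joint_dim)
        self.pred_proj = nn.Linear(embed_dim, joint_dim)
        self.out_proj = nn.Linear(joint_dim, embed_dim)

    def predict(self, prev_labels):
        # prev_labels: (batch, context_size) int tensor of most recent labels.
        emb = self.embedding(prev_labels)                 # (B, C, E)
        w = F.softmax(self.pos_weights, dim=0)            # convex combination
        return (w.view(1, -1, 1) * emb).sum(dim=1)        # (B, E)

    def joint(self, enc_out, pred_out):
        # enc_out: (B, encoder_dim); pred_out: (B, embed_dim)
        h = torch.tanh(self.enc_proj(enc_out) + self.pred_proj(pred_out))
        h = self.out_proj(h)                              # (B, E)
        # Weight tying: logits come from the shared embedding matrix.
        return h @ self.embedding.weight.t()              # (B, vocab_size)
```

Because the prediction network is just an embedding lookup followed by a weighted sum, its parameter count is dominated by the embedding table, which under weight tying is shared with the output layer rather than duplicated.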