Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition - 专知论文


June 20, 2022

Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition


Zhifu Gao, Shiliang Zhang, Ian McLoughlin, Zhijie Yan

From arXiv; 5 pages, 3 figures; accepted by INTERSPEECH 2022

Transformers have recently dominated the ASR field. Although able to yield good performance, they involve an autoregressive (AR) decoder that generates tokens one by one, which is computationally inefficient. To speed up inference, non-autoregressive (NAR) methods, e.g. single-step NAR, were designed to enable parallel generation. However, due to an independence assumption within the output tokens, the performance of single-step NAR is inferior to that of AR models, especially with a large-scale corpus. There are two challenges to improving single-step NAR: first, to accurately predict the number of output tokens and extract hidden variables; second, to enhance modeling of interdependence between output tokens. To tackle both challenges, we propose a fast and accurate parallel transformer, termed Paraformer. This utilizes a continuous integrate-and-fire based predictor to predict the number of tokens and generate hidden variables. A glancing language model (GLM) sampler then generates semantic embeddings to enhance the NAR decoder's ability to model context interdependence. Finally, we design a strategy to generate negative samples for minimum word error rate training to further improve performance. Experiments using the public AISHELL-1 and AISHELL-2 benchmarks and an industrial-level 20,000-hour task demonstrate that the proposed Paraformer can attain comparable performance to the state-of-the-art AR transformer, with more than 10x speedup.
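The abstract does not spell out how the continuous integrate-and-fire (CIF) predictor works, so the following is only a minimal sketch of the general CIF idea, not the authors' implementation. It assumes each encoder frame comes with a weight in [0, 1], that weights are accumulated frame by frame, and that one token-level hidden vector is "fired" whenever the accumulated weight reaches a threshold of 1.0. The function name `cif`, its parameters, and the toy inputs are hypothetical and chosen for illustration.

```python
from typing import List


def cif(hidden: List[List[float]], alphas: List[float],
        threshold: float = 1.0) -> List[List[float]]:
    """Toy continuous integrate-and-fire over a sequence of frames.

    hidden: one feature vector per encoder frame (hypothetical input).
    alphas: one weight per frame, assumed to lie in [0, threshold].
    Returns one integrated vector per fired token.
    """
    dim = len(hidden[0]) if hidden else 0
    fired: List[List[float]] = []
    acc_weight = 0.0            # weight accumulated since the last firing
    acc_state = [0.0] * dim     # weighted sum of frames since the last firing

    for frame, alpha in zip(hidden, alphas):
        if acc_weight + alpha < threshold:
            # Not enough weight yet: keep integrating.
            acc_weight += alpha
            acc_state = [s + alpha * x for s, x in zip(acc_state, frame)]
        else:
            # Use just enough of this frame's weight to reach the threshold,
            # fire a token, and carry the remainder into the next token.
            used = threshold - acc_weight
            fired.append([s + used * x for s, x in zip(acc_state, frame)])
            rest = alpha - used
            acc_weight = rest
            acc_state = [rest * x for x in frame]

    return fired


# Toy example: five 1-dimensional frames whose weights sum to 2.5.
frames = [[1.0], [3.0], [2.0], [4.0], [6.0]]
weights = [0.5, 0.5, 0.25, 0.75, 0.5]
tokens = cif(frames, weights)
print(len(tokens), tokens)
```

In this sketch the number of fired vectors plays the role of the predicted token count, and the fired vectors stand in for the hidden variables passed to the parallel decoder. The trailing weight of 0.5 in the toy input never reaches the threshold, so it does not produce a token; how such leftovers are handled in a real system is a design choice not covered here.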

