Non-autoregressive Transformer-based End-to-end ASR using BERT - Zhuanzhi (专知) Papers


Speech recognition · Performer · BERT · MoDELS · End-to-end

April 18, 2022

Non-autoregressive Transformer-based End-to-end ASR using BERT


Fu-Hao Yu, Kuan-Yu Chen

Transformer-based models have led to significant innovation in classical and practical subjects as varied as speech processing, natural language processing, and computer vision. On top of the Transformer, attention-based end-to-end automatic speech recognition (ASR) models have recently become popular. Specifically, non-autoregressive modeling, which boasts fast inference and performance comparable to conventional autoregressive methods, is an emerging research topic. In the context of natural language processing, the bidirectional encoder representations from Transformers (BERT) model has received widespread attention, partially due to its ability to infer contextualized word representations and to enable superior performance for downstream tasks while needing only simple fine-tuning. Motivated by this success, we view speech recognition as a downstream task of BERT, so that an ASR system can be derived by fine-tuning. Consequently, to not only inherit the advantages of non-autoregressive ASR models but also enjoy the benefits of a pre-trained language model (e.g., BERT), we propose a non-autoregressive Transformer-based end-to-end ASR model based on BERT. We conduct a series of experiments on the AISHELL-1 dataset that demonstrate competitive or superior results for the model when compared to state-of-the-art ASR systems.
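The contrast between autoregressive and non-autoregressive decoding mentioned in the abstract can be illustrated with a toy sketch. It is not the authors' model: the scoring functions `toy_step_fn` and `toy_parallel_fn`, the four-token vocabulary, and the assumption that the output length is known in advance are all hypothetical stand-ins chosen only to show why a non-autoregressive decoder needs a single pass where an autoregressive one needs one pass per token.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical toy setup: token id 0 is <eos>, and the "correct" transcript is [1, 2, 3].
EOS_ID = 0
VOCAB_SIZE = 4
TARGET = [1, 2, 3]


def argmax(scores: Sequence[float]) -> int:
    """Index of the highest score."""
    return max(range(len(scores)), key=lambda i: scores[i])


def one_hot(index: int) -> List[float]:
    """Score vector that puts all mass on one token id."""
    return [1.0 if i == index else 0.0 for i in range(VOCAB_SIZE)]


def toy_step_fn(prefix: List[int]) -> List[float]:
    """Stand-in for an autoregressive decoder step conditioned on the prefix."""
    pos = len(prefix)
    return one_hot(TARGET[pos] if pos < len(TARGET) else EOS_ID)


def toy_parallel_fn(length: int) -> List[List[float]]:
    """Stand-in for a non-autoregressive decoder: scores for every position at once."""
    return [one_hot(TARGET[pos]) for pos in range(length)]


def autoregressive_decode(
    step_fn: Callable[[List[int]], List[float]], max_len: int
) -> Tuple[List[int], int]:
    """Greedy left-to-right decoding: one decoder call per emitted token."""
    tokens: List[int] = []
    calls = 0
    for _ in range(max_len):
        scores = step_fn(tokens)  # depends on all previously emitted tokens
        calls += 1
        next_id = argmax(scores)
        if next_id == EOS_ID:
            break
        tokens.append(next_id)
    return tokens, calls


def non_autoregressive_decode(
    parallel_fn: Callable[[int], List[List[float]]], length: int
) -> Tuple[List[int], int]:
    """Single-pass decoding: every position is predicted independently."""
    score_matrix = parallel_fn(length)  # one decoder call for the whole sequence
    return [argmax(row) for row in score_matrix], 1


ar_tokens, ar_calls = autoregressive_decode(toy_step_fn, max_len=10)
nar_tokens, nar_calls = non_autoregressive_decode(toy_parallel_fn, length=len(TARGET))
print(ar_tokens, ar_calls)
print(nar_tokens, nar_calls)
```

In this sketch both decoders recover the same token sequence, but the autoregressive loop calls the decoder once per token plus once more for the end-of-sequence symbol, while the non-autoregressive version calls it once. A real non-autoregressive system also has to obtain the output length somehow, which the toy simply assumes is given.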


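Related content

Speech recognition

Speech recognition is an interdisciplinary subfield of computer science and computational linguistics that develops methods and technologies enabling computers to recognize spoken language and translate it into text. It is also known as automatic speech recognition (ASR), computer speech recognition, or speech-to-text (STT). It draws on knowledge and research from computer science, linguistics, and computer engineering.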

