WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit - Zhuanzhi (专知) Papers


Performer · Attention · Speech recognition · End-to-end · Decoding

July 5, 2022

WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit


Binbin Zhang, Di Wu, Zhendong Peng, Xingchen Song, Zhuoyuan Yao, Hang Lv, Lei Xie, Chao Yang, Fuping Pan, Jianwei Niu

Recently, we made available WeNet, a production-oriented end-to-end speech recognition toolkit, which introduces a unified two-pass (U2) framework and a built-in runtime to address the streaming and non-streaming decoding modes in a single model. To further improve ASR performance and facilitate various production requirements, in this paper, we present WeNet 2.0 with four important updates. (1) We propose U2++, a unified two-pass framework with bidirectional attention decoders, which includes the future contextual information by a right-to-left attention decoder to improve the representative ability of the shared encoder and the performance during the rescoring stage. (2) We introduce an n-gram based language model and a WFST-based decoder into WeNet 2.0, promoting the use of rich text data in production scenarios. (3) We design a unified contextual biasing framework, which leverages user-specific context (e.g., contact lists) to provide rapid adaptation ability for production and improves ASR accuracy in both with-LM and without-LM scenarios. (4) We design a unified IO to support large-scale data for effective model training. In summary, the brand-new WeNet 2.0 achieves up to 10% relative recognition performance improvement over the original WeNet on various corpora and makes available several important production-oriented features.
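The rescoring stage mentioned in update (1) can be illustrated with a small sketch. The abstract does not state how the scores are combined, so the weighted sum below, the `Hypothesis` class, the `rescore` function, and the `ctc_weight` and `reverse_weight` parameters are all assumptions made for illustration, not the toolkit's actual API. The idea shown is only that each first-pass candidate is scored by both a left-to-right and a right-to-left attention decoder, and the two scores are interpolated before picking the best candidate.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    """One first-pass candidate with hypothetical log-probability scores."""
    text: str
    ctc_score: float  # first-pass score from the shared encoder's CTC output
    l2r_score: float  # score from the left-to-right attention decoder
    r2l_score: float  # score from the right-to-left attention decoder


def rescore(hyps: List[Hypothesis],
            ctc_weight: float = 0.5,
            reverse_weight: float = 0.3) -> Hypothesis:
    """Return the candidate with the highest combined score.

    Assumed combination rule (not taken from the abstract):
      attention = (1 - reverse_weight) * l2r + reverse_weight * r2l
      total     = attention + ctc_weight * ctc
    """
    def combined(h: Hypothesis) -> float:
        attention = ((1.0 - reverse_weight) * h.l2r_score
                     + reverse_weight * h.r2l_score)
        return attention + ctc_weight * h.ctc_score

    return max(hyps, key=combined)


candidates = [
    Hypothesis("call john", ctc_score=-1.0, l2r_score=-3.0, r2l_score=-6.0),
    Hypothesis("call joan", ctc_score=-1.0, l2r_score=-3.2, r2l_score=-3.0),
]
best_without_r2l = rescore(candidates, reverse_weight=0.0)
best_with_r2l = rescore(candidates, reverse_weight=0.3)
```

With `reverse_weight=0.0` only the left-to-right decoder contributes, so the first candidate wins; giving the right-to-left decoder some weight lets its view of the future context change the ranking in this toy example.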

