End-to-end automatic speech recognition (ASR) systems represent the state of the art, but they rely on thousands of hours of manually annotated speech for training, as well as heavyweight computation for inference. Of course, this impedes commercialization, since most companies lack vast human and computational resources. In this paper, we explore training and deploying an ASR system in the label-scarce, compute-limited setting. To reduce human labor, we use a third-party ASR system as a weak supervision source, supplemented with labeling functions derived from implicit user feedback. To accelerate inference, we propose to route production-time queries across a pool of CUDA graphs of varying input lengths, whose distribution best matches that of the traffic. Compared to our third-party ASR, we achieve a relative improvement in word-error rate of 8% and a speedup of 600%. Our system, called SpeechNet, currently serves 12 million queries per day on our voice-enabled smart television. To our knowledge, this is the first time a large-scale, Wav2vec-based deployment has been described in the academic literature.
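To make the weak-supervision recipe concrete, the following is a minimal sketch of labeling functions over implicit user feedback, in the spirit of the description above. The feedback signals (`user_reissued`, `action_completed`), the confidence threshold, and the majority-vote combiner are illustrative assumptions, not the system's actual labeling functions.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

KEEP, DISCARD, ABSTAIN = 1, 0, -1

@dataclass
class Query:
    transcript: str          # third-party ASR hypothesis (the weak label)
    asr_confidence: float    # third-party decoder confidence in [0, 1]
    user_reissued: bool      # user immediately repeated the query
    action_completed: bool   # downstream action succeeded (implicit feedback)

def lf_high_confidence(q: Query) -> int:
    """Trust transcripts the third-party decoder is confident about."""
    return KEEP if q.asr_confidence > 0.9 else ABSTAIN

def lf_reissue(q: Query) -> int:
    """An immediate reissue suggests the query was misrecognized."""
    return DISCARD if q.user_reissued else ABSTAIN

def lf_action(q: Query) -> int:
    """A completed downstream action suggests the transcript was usable."""
    return KEEP if q.action_completed else ABSTAIN

def weak_label(q: Query, lfs: List[Callable[[Query], int]]) -> Optional[int]:
    """Combine labeling-function votes by simple majority; None = unlabeled."""
    votes = [v for lf in lfs if (v := lf(q)) != ABSTAIN]
    if not votes:
        return None  # no signal: exclude from the training set
    return KEEP if sum(votes) > len(votes) / 2 else DISCARD

# Usage: keep (audio, transcript) pairs that the labeling functions endorse.
q = Query("play jazz", asr_confidence=0.95, user_reissued=False, action_completed=True)
assert weak_label(q, [lf_high_confidence, lf_reissue, lf_action]) == KEEP
```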
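Similarly, here is a minimal sketch of the length-bucketed CUDA graph routing, assuming a PyTorch model whose forward pass accepts a padded `(1, 1, L)` waveform tensor. The `GraphPool` class, the toy `Conv1d` model, and the bucket lengths are illustrative assumptions; the idea is that bucket lengths are chosen so their distribution matches the traffic's, minimizing expected padding.

```python
import bisect
import torch

class GraphPool:
    """Pre-capture one CUDA graph per input-length bucket; route each query
    to the smallest bucket that fits, padding up to its static shape."""

    def __init__(self, model: torch.nn.Module, bucket_lengths, device="cuda"):
        self.model = model.to(device).eval()
        self.lengths = sorted(bucket_lengths)
        self.graphs, self.inputs, self.outputs = [], [], []
        side = torch.cuda.Stream()
        for L in self.lengths:
            x = torch.zeros(1, 1, L, device=device)  # static input buffer
            # Warm up on a side stream: capture requires kernels to be
            # initialized before recording begins.
            side.wait_stream(torch.cuda.current_stream())
            with torch.cuda.stream(side), torch.no_grad():
                self.model(x)
            torch.cuda.current_stream().wait_stream(side)
            g = torch.cuda.CUDAGraph()
            with torch.cuda.graph(g), torch.no_grad():
                y = self.model(x)  # capture reads from x and writes y
            self.graphs.append(g)
            self.inputs.append(x)
            self.outputs.append(y)

    @torch.no_grad()
    def __call__(self, wave: torch.Tensor) -> torch.Tensor:
        n = wave.shape[-1]
        i = bisect.bisect_left(self.lengths, n)
        if i == len(self.lengths):
            # Longer than any captured bucket: fall back to eager execution.
            return self.model(wave.view(1, 1, -1))
        buf = self.inputs[i]
        buf.zero_()
        buf[0, 0, :n].copy_(wave)        # zero-pad up to the bucket length
        self.graphs[i].replay()          # replay the pre-captured kernels
        return self.outputs[i].clone()   # copy out before the next replay

# Example (requires a CUDA device); bucket lengths could instead be set to
# quantiles of logged traffic lengths so the pool matches production load:
#   pool = GraphPool(torch.nn.Conv1d(1, 32, 5, padding=2), [16000, 48000, 80000])
#   logits = pool(torch.randn(24000, device="cuda"))
```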