Contextual knowledge is essential for reducing speech recognition errors on high-value long-tail words. This paper proposes a novel tree-constrained pointer generator (TCPGen) component that enables end-to-end ASR models to bias towards a list of long-tail words obtained from external contextual information. With only a small overhead in memory use and computation cost, TCPGen efficiently structures thousands of biasing words into a symbolic prefix tree and creates a neural shortcut between the tree and the final ASR output to facilitate the recognition of the biasing words. To enhance TCPGen, we further propose a novel minimum biasing word error (MBWE) loss that directly optimises biasing word errors during training, along with a biasing-word-driven language model discounting (BLMD) method applied at test time. All contextual ASR systems were evaluated on the public Librispeech audiobook corpus and on data from the dialogue state tracking challenges (DSTC), with biasing lists extracted from the dialogue-system ontology. Consistent word error rate (WER) reductions were achieved with TCPGen, which were particularly significant on the biasing words, with around 40\% relative reduction in recognition error rates. MBWE and BLMD further improved the effectiveness of TCPGen and achieved even larger WER reductions on the biasing words. TCPGen also achieved zero-shot learning of words not in the audio training set, with large WER reductions on the out-of-vocabulary words in the biasing list.
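To illustrate the symbolic side of the approach, the following is a minimal sketch of how a biasing list can be structured into a prefix tree (trie). The neural components of TCPGen are omitted; the helper names (`build_prefix_tree`, `valid_next_tokens`) are illustrative rather than from the paper. It shows only the symbolic structure and the set of tokens the tree permits after a given decoded prefix, which is the constraint a tree-constrained pointer distribution would respect.

```python
class TrieNode:
    """A node in the symbolic prefix tree of biasing words."""
    def __init__(self):
        self.children = {}      # token -> TrieNode
        self.is_word_end = False

def build_prefix_tree(biasing_words):
    """Insert each biasing word, as a token sequence, into a trie."""
    root = TrieNode()
    for word in biasing_words:
        node = root
        for token in word:      # e.g. characters or wordpieces
            node = node.children.setdefault(token, TrieNode())
        node.is_word_end = True
    return root

def valid_next_tokens(root, prefix):
    """Return the tokens the tree allows after `prefix`.

    An empty set means the prefix has fallen off the tree, so no
    biasing word can be completed from here.
    """
    node = root
    for token in prefix:
        if token not in node.children:
            return set()
        node = node.children[token]
    return set(node.children)

# Example: three biasing words stored character by character.
root = build_prefix_tree(["cat", "car", "dog"])
valid_next_tokens(root, "ca")   # {'t', 'r'}
```

Lookup cost is linear in the prefix length and independent of the number of biasing words, which is why thousands of words can be handled with little overhead.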