On Word Error Rate Definitions and their Efficient Computation for Multi-Speaker Speech Recognition Systems

November 29, 2022

Thilo von Neumann,Christoph Boeddeker,Keisuke Kinoshita,Marc Delcroix,Reinhold Haeb-Umbach

From arXiv; submitted to ICASSP 2023

We present a general framework to compute the word error rate (WER) of ASR systems that process recordings containing multiple speakers at their input and that produce multiple output word sequences (MIMO). Such ASR systems are typically required, e.g., for meeting transcription. We provide an efficient implementation based on a dynamic programming search in a multi-dimensional Levenshtein distance tensor under the constraint that a reference utterance must be matched consistently with one hypothesis output. This also results in an efficient implementation of the ORC WER which previously suffered from exponential complexity. We give an overview of commonly used WER definitions for multi-speaker scenarios and show that they are specializations of the above MIMO WER tuned to particular application scenarios. We conclude with a discussion of the pros and cons of the various WER definitions and a recommendation when to use which.
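To make the complexity claim concrete, here is a minimal Python sketch of a word-level Levenshtein distance, the single-speaker WER built on it, and a brute-force ORC WER. It is not the authors' implementation. It assumes the usual reading of ORC WER: every reference utterance is assigned to exactly one hypothesis output channel, utterances keep their original order within a channel, and the assignment with the fewest total word errors is chosen. The function names `edit_distance`, `wer`, and `orc_wer_bruteforce` are hypothetical and chosen for illustration only.

```python
from itertools import product


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp) + 1))  # distance from empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance to empty hypothesis
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (or match, cost 0)
            ))
        prev = curr
    return prev[-1]


def wer(ref, hyp):
    """Single-speaker WER: word errors divided by reference length."""
    return edit_distance(ref, hyp) / len(ref)


def orc_wer_bruteforce(ref_utterances, hyp_channels):
    """ORC WER by exhaustive search (assumed definition, see lead-in).

    Tries all C**U assignments of U reference utterances to C hypothesis
    channels, so the cost grows exponentially with the number of utterances.
    """
    num_channels = len(hyp_channels)
    total_ref_words = sum(len(u) for u in ref_utterances)
    best = None
    for assignment in product(range(num_channels), repeat=len(ref_utterances)):
        # Concatenate the utterances assigned to each channel, in order.
        merged = [[] for _ in range(num_channels)]
        for utterance, channel in zip(ref_utterances, assignment):
            merged[channel].extend(utterance)
        errors = sum(edit_distance(m, h) for m, h in zip(merged, hyp_channels))
        if best is None or errors < best:
            best = errors
    return best / total_ref_words
```

The exhaustive loop over `product(...)` is what makes this version impractical for long meetings; the abstract states that the proposed dynamic programming search over a multi-dimensional Levenshtein distance tensor avoids this exponential cost.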
