In this paper, we conduct a comparative study on speaker-attributed automatic speech recognition (SA-ASR) in the multi-party meeting scenario, a topic of increasing interest in rich meeting transcription. Specifically, three approaches are evaluated. The first, FD-SOT, consists of a frame-level diarization model to identify speakers and a multi-talker ASR model to recognize utterances; speaker-attributed transcriptions are obtained by aligning the diarization results with the recognized hypotheses. However, this alignment strategy may suffer from erroneous timestamps due to the independence of the two modules, severely hindering performance. We therefore propose the second approach, WD-SOT, which addresses alignment errors by introducing a word-level diarization model, eliminating the dependency on timestamp alignment. To further mitigate alignment issues, we propose the third approach, TS-ASR, which jointly trains a target-speaker separation module and an ASR module. By comparing various strategies for each SA-ASR approach, experimental results on a real meeting corpus, AliMeeting, reveal that WD-SOT achieves a 10.7% relative reduction in averaged speaker-dependent character error rate (SD-CER) compared with FD-SOT. Moreover, TS-ASR also outperforms FD-SOT, bringing a 16.5% relative reduction in average SD-CER.