X-SepFormer: End-to-end Speaker Extraction Network with Explicit Optimization on Speaker Confusion

Target speech extraction (TSE) systems are designed to extract a target speaker's speech from a multi-talker mixture. Most prior TSE networks are trained to improve the reconstruction quality of the extracted speech waveform. However, it has been reported that a TSE system that delivers high reconstruction performance may still give a poor listening experience in practice. One such problem is wrong speaker extraction (called speaker confusion, SC), which leads to a strongly negative experience and hampers effective conversation. To mitigate the SC issue, we reformulate the training objective and propose two novel loss schemes that exploit a reconstruction-improvement metric defined at the level of small chunks and leverage the distribution information associated with that metric. Both loss schemes aim to encourage a TSE network to pay attention to the SC chunks identified from this distribution information. On this basis, we present X-SepFormer, an end-to-end TSE model that combines the proposed loss schemes with a SepFormer backbone. Experimental results on the benchmark WSJ0-2mix dataset validate the effectiveness of our proposals, showing a consistent reduction in SC errors (by 14.8% relative). Moreover, with an SI-SDRi of 19.4 dB and a PESQ of 3.81, our best system significantly outperforms the current SOTA systems and offers the best TSE results reported to date on WSJ0-2mix.
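To make the chunk-level idea concrete, here is a minimal sketch of how a reconstruction-improvement metric (SI-SDRi) could be computed per chunk and used to flag chunks that look like speaker confusion. It is an illustration under stated assumptions, not the loss schemes of the paper, which the abstract does not specify. The function names, the non-overlapping chunking, the chunk length, and the 0 dB threshold are all hypothetical choices made for this example.

```python
import numpy as np

def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant SDR in dB between two 1-D signals."""
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Project the estimate onto the reference to get the scaled target part.
    scale = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = scale * reference
    noise = estimate - target
    return 10.0 * np.log10((np.dot(target, target) + eps) / (np.dot(noise, noise) + eps))

def chunk_si_sdri(estimate, mixture, reference, chunk_len):
    """SI-SDR improvement over the mixture, one value per non-overlapping chunk.

    Assumption: trailing samples that do not fill a whole chunk are ignored.
    """
    n_chunks = len(reference) // chunk_len
    scores = []
    for i in range(n_chunks):
        s = slice(i * chunk_len, (i + 1) * chunk_len)
        scores.append(si_sdr(estimate[s], reference[s]) - si_sdr(mixture[s], reference[s]))
    return np.array(scores)

def sc_chunk_ratio(scores, threshold=0.0):
    """Fraction of chunks whose SI-SDRi falls below the (assumed) threshold."""
    return float(np.mean(scores < threshold))

# Toy example: white noise stands in for the two speakers.
rng = np.random.default_rng(0)
target = rng.standard_normal(16000)
interferer = rng.standard_normal(16000)
mixture = target + interferer

# Simulated extraction that is correct in the first half and
# outputs the wrong speaker in the second half.
estimate = np.concatenate([
    target[:8000] + 0.01 * interferer[:8000],
    interferer[8000:],
])

scores = chunk_si_sdri(estimate, mixture, target, chunk_len=4000)
ratio = sc_chunk_ratio(scores)
```

In this toy setup the first two chunks get a large positive SI-SDRi and the last two a large negative one, so half of the chunks are flagged. A training objective could then give extra weight to the flagged chunks; how the paper does so is not stated in the abstract.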

