Improving Target Speaker Extraction with Sparse LDA-transformed Speaker Embeddings

As a practical alternative to speech separation, target speaker extraction (TSE) aims to extract the speech of a desired speaker using an additional cue derived from that speaker. Its main challenge lies in how to extract and leverage the speaker cue so as to improve the quality of the extracted speech. Most existing TSE studies obtain the cue by directly using a discriminative speaker embedding extracted from a model pre-trained for speaker verification. Although high speaker discriminability is the most desirable property for the speaker verification task, we argue that it may be too sophisticated for TSE. In this study, we propose that a simplified speaker cue with clear class separability might be preferred for TSE. To verify our proposal, we introduce several forms of speaker cues, including naive speaker embeddings (such as x-vector and xi-vector) and new speaker embeddings produced by a sparse LDA transform. The corresponding TSE models are built by integrating these speaker cues with SepFormer, a state-of-the-art (SOTA) speech separation model. The performance of these TSE models is examined on the benchmark WSJ0-2mix dataset. Experimental results validate the effectiveness and generalizability of our proposal, showing up to 9.9% relative improvement in SI-SDRi. Moreover, with an SI-SDRi of 19.4 dB and a PESQ of 3.78, our best TSE system significantly outperforms the current SOTA systems and offers the best TSE results reported to date on WSJ0-2mix.
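For readers unfamiliar with the headline metric, the following is a minimal sketch of how SI-SDR and its improvement over the unprocessed mixture (SI-SDRi) are commonly computed. It follows the standard scale-invariant SDR definition and is not taken from the paper; the function names `si_sdr` and `si_sdr_improvement`, the `eps` guard, and the toy signals are all assumptions made for illustration.

```python
import math


def _dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def si_sdr(estimate, target, eps=1e-12):
    """Scale-invariant SDR in dB (hypothetical helper, standard definition)."""
    n = len(target)
    # Remove the mean so a constant offset does not affect the score.
    mean_e = sum(estimate) / n
    mean_t = sum(target) / n
    e = [x - mean_e for x in estimate]
    t = [x - mean_t for x in target]
    # Project the estimate onto the target to get the scaled reference.
    alpha = _dot(e, t) / (_dot(t, t) + eps)
    s_target = [alpha * x for x in t]
    # Whatever is left after the projection is counted as distortion.
    e_noise = [x - y for x, y in zip(e, s_target)]
    ratio = _dot(s_target, s_target) / (_dot(e_noise, e_noise) + eps)
    return 10.0 * math.log10(ratio + eps)


def si_sdr_improvement(estimate, mixture, target):
    """SI-SDRi: gain of the extracted signal over the raw mixture, in dB."""
    return si_sdr(estimate, target) - si_sdr(mixture, target)


# Toy signals (assumed values, for illustration only).
target = [1.0, -1.0, 0.5, -0.5]
interferer = [0.3, 0.2, -0.4, 0.1]
mixture = [t + i for t, i in zip(target, interferer)]
# An imperfect extraction that still leaks 10% of the interferer.
estimate = [t + 0.1 * i for t, i in zip(target, interferer)]

improvement = si_sdr_improvement(estimate, mixture, target)
print(f"SI-SDRi: {improvement:.2f} dB")
```

Because the estimate is projected onto the target before the ratio is taken, rescaling the estimate leaves the score essentially unchanged, which is why the metric is called scale-invariant.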

