Listen only to me! How well can target speech extraction handle false alarms?

Target speech extraction (TSE) extracts the speech of a target speaker from a mixture, given auxiliary clues that characterize the speaker, such as an enrollment utterance. TSE thus addresses the challenging problem of performing separation and speaker identification simultaneously. Extraction performance has improved considerably with the recent development of neural networks for speech enhancement and separation. Most studies have focused on mixtures in which the target speaker is actively speaking. In practice, however, the target speaker is sometimes silent, i.e., an inactive speaker (IS). A typical TSE system tends to output a signal in IS cases, causing false alarms, which is a severe problem for the practical deployment of TSE systems. This paper aims to better understand how well TSE systems can handle IS cases. We consider two approaches to dealing with IS: (1) training a system to directly output zero signals, or (2) detecting IS with an extra speaker verification module. We perform an extensive experimental comparison of these schemes in terms of extraction performance and IS detection using the LibriMix dataset, and reveal their pros and cons.
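To make the two IS-handling schemes concrete, here is a minimal sketch of how an inactive-speaker decision might be taken in each case. It is an illustration under stated assumptions, not the paper's implementation: the function names, the energy threshold of -30 dB, the cosine threshold of 0.5, and the use of plain Python lists for signals and speaker embeddings are all hypothetical choices made for this example.

```python
import math


def energy_db(signal, eps=1e-10):
    """Mean power of a signal in dB (eps avoids log of zero)."""
    power = sum(x * x for x in signal) / max(len(signal), 1)
    return 10.0 * math.log10(power + eps)


def is_inactive_by_energy(extracted, mixture, threshold_db=-30.0):
    """Scheme (1), assumed decision rule: a system trained to output zero
    signals should strongly attenuate the mixture when the target is silent.
    threshold_db is a hypothetical value, not taken from the paper."""
    return energy_db(extracted) - energy_db(mixture) < threshold_db


def cosine_similarity(a, b, eps=1e-10):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)


def is_inactive_by_verification(enroll_emb, extracted_emb, threshold=0.5):
    """Scheme (2), assumed decision rule: compare a speaker embedding of the
    extracted signal with the enrollment embedding. The embeddings would come
    from a separate speaker verification model, which is not shown here."""
    return cosine_similarity(enroll_emb, extracted_emb) < threshold


def false_alarm_rate(detected_inactive):
    """Fraction of IS mixtures for which inactivity was NOT detected.
    detected_inactive holds one boolean per mixture with a silent target."""
    if not detected_inactive:
        return 0.0
    misses = sum(1 for flag in detected_inactive if not flag)
    return misses / len(detected_inactive)
```

In this sketch a false alarm is counted whenever the system fails to flag a mixture in which the target speaker is silent. In practice both thresholds would need to be tuned on held-out data.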

