Neural Target Speech Extraction: An Overview

Humans can listen to a target speaker even in challenging acoustic conditions with noise, reverberation, and interfering speakers. This phenomenon is known as the cocktail-party effect. For decades, researchers have worked toward matching this human listening ability. One critical issue is handling interfering speakers: because the target and non-target speech signals share similar characteristics, they are hard to tell apart. Target speech/speaker extraction (TSE) isolates the speech signal of a target speaker from a mixture of several speakers, with or without noise and reverberation, using clues that identify the speaker in the mixture. Such a clue might be a spatial clue indicating the direction of the target speaker, a video of the speaker's lips, or a pre-recorded enrollment utterance from which their voice characteristics can be derived. TSE is an emerging field of research that has received increased attention in recent years because it offers a practical approach to the cocktail-party problem and draws on several areas of signal processing, including audio, visual, and array processing, as well as deep learning. This paper focuses on recent neural-based approaches and presents an in-depth overview of TSE. We guide readers through the different major approaches, emphasizing the similarities among frameworks and discussing potential future directions.

