Hybrid meetings have become increasingly necessary in the post-COVID period and have brought new challenges for audio processing. In particular, the interplay between acoustic echo and acoustic howling in a hybrid meeting makes their joint suppression difficult. This paper proposes a deep learning approach that tackles this problem by formulating the recurrent feedback suppression process as an instantaneous speech separation task using a teacher-forced training strategy. Specifically, a self-attentive recurrent neural network is used to extract the target speech from microphone recordings with the help of accessible and learned reference signals, thus suppressing acoustic echo and acoustic howling simultaneously. Different combinations of input signals and loss functions are investigated to improve performance. Experimental results demonstrate the effectiveness of the proposed method for jointly suppressing echo and howling in hybrid meetings.
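To illustrate why teacher forcing can turn a recurrent feedback problem into an instantaneous mixture, the following is a minimal NumPy sketch and not the paper's implementation. It rests on several assumptions that the abstract does not state: that teacher forcing here means replacing the signal replayed by the loudspeaker with the clean target speech during training, that the feedback path can be modeled as a short FIR filter with one sample of loop delay, and that the closed loop runs without any suppressor. The function names `closed_loop_mic` and `teacher_forced_mic` and the parameters `feedback_path` and `gain` are hypothetical.

```python
import numpy as np


def closed_loop_mic(target, echo, feedback_path, gain):
    """Recurrent case (assumed model): the loudspeaker replays the amplified
    microphone signal, so each sample depends on past microphone samples."""
    n = len(target)
    mic = np.zeros(n)
    for t in range(n):
        fb = 0.0
        for j, h in enumerate(feedback_path):
            # One sample of loop delay keeps the system causal.
            if t - 1 - j >= 0:
                fb += h * gain * mic[t - 1 - j]
        mic[t] = target[t] + echo[t] + fb
    return mic


def teacher_forced_mic(target, echo, feedback_path, gain):
    """Teacher-forced case (assumed model): the replayed signal is the clean
    target speech, so the microphone signal is a plain mixture with no
    sample-by-sample recursion."""
    n = len(target)
    playback = gain * np.asarray(target, dtype=float)
    fb = np.convolve(playback, feedback_path)[:n]
    # Apply the same one-sample loop delay as in the recurrent case.
    fb = np.concatenate(([0.0], fb[:-1]))
    return target + echo + fb


# Toy signals: a unit impulse as target speech and no far-end echo.
target = np.zeros(50)
target[0] = 1.0
echo = np.zeros(50)
feedback_path = np.array([1.0])  # hypothetical single-tap feedback path

recurrent = closed_loop_mic(target, echo, feedback_path, gain=1.2)
forced = teacher_forced_mic(target, echo, feedback_path, gain=1.2)
print("peak, closed loop:   ", np.max(np.abs(recurrent)))
print("peak, teacher forced:", np.max(np.abs(forced)))
```

With a loop gain above one, the closed-loop signal grows without bound (howling), while the teacher-forced mixture stays bounded and can be generated in one pass. Under these assumptions, that is what allows a separation model to be trained on ordinary mixtures rather than by unrolling the feedback loop.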