Acoustic echo cancellation (AEC) in full-duplex communication systems eliminates acoustic feedback. However, nonlinear distortions induced by audio devices, background noise, reverberation, and double-talk reduce the effectiveness of conventional AEC systems. To address this, several hybrid AEC models have been proposed that use deep learning to suppress the residual echo left by standard adaptive filtering. This paper proposes a deep learning-based joint AEC and beamforming model (JAECBF) that builds on our previous self-attentive recurrent neural network (RNN) beamformer. The proposed network consists of two modules: (i) a multi-channel neural AEC, and (ii) a joint AEC-RNN beamformer with a double-talk detection (DTD) module that computes time-frequency (T-F) beamforming weights. We train the proposed model end-to-end to eliminate background noise and echoes from far-end audio devices, including their nonlinear distortions. Experimental evaluations show that the proposed network outperforms other multi-channel AEC and denoising systems in terms of speech recognition rate and overall speech quality.
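As a rough illustration of what applying T-F beamforming weights means, the sketch below performs a conventional filter-and-sum over microphone channels for each time-frequency bin, y(t, f) = sum over c of conj(w_c(t, f)) * x_c(t, f). The abstract does not state the exact formulation used by JAECBF, so this form is an assumption. The function name `apply_tf_beamformer`, the nested-list layout [frame][bin][channel], and the hand-set weights are hypothetical; in the proposed system the weights would be predicted by the network rather than fixed by hand.

```python
def apply_tf_beamformer(weights, mixture):
    """Filter-and-sum per T-F bin: y[t][f] = sum_c conj(w[t][f][c]) * x[t][f][c].

    Both arguments are nested lists indexed as [frame][frequency_bin][channel]
    and hold complex STFT-domain values. This layout is an assumption made
    for illustration only.
    """
    output = []
    for w_frame, x_frame in zip(weights, mixture):
        out_frame = []
        for w_bin, x_bin in zip(w_frame, x_frame):
            if len(w_bin) != len(x_bin):
                raise ValueError("weights and mixture must have the same channel count")
            # Conjugate each weight, multiply by its channel, and sum over channels.
            out_frame.append(sum(w.conjugate() * x for w, x in zip(w_bin, x_bin)))
        output.append(out_frame)
    return output


# Toy input: 1 frame, 2 frequency bins, 2 microphones.
mixture = [[[1 + 1j, 1 - 1j], [2 + 0j, 0 + 2j]]]
# Hypothetical hand-set weights: average both microphones in bin 0,
# keep only microphone 0 in bin 1.
weights = [[[0.5 + 0j, 0.5 + 0j], [1 + 0j, 0 + 0j]]]

enhanced = apply_tf_beamformer(weights, mixture)
print(enhanced)
```

The output has one complex value per frame and frequency bin, that is, a single-channel spectrogram that would then be passed through an inverse STFT to obtain the enhanced waveform.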