In this paper, we formulate acoustic howling suppression (AHS) as a supervised learning problem and propose a deep learning approach, called Deep AHS, to address it. Deep AHS is trained in a teacher-forcing manner, which converts the recurrent howling suppression process into an instantaneous speech separation process, simplifying the problem and accelerating model training. The proposed method uses properly designed features and trains an attention-based recurrent neural network (RNN) to extract the target signal from the microphone recording, thus attenuating the playback signal that may lead to howling. Different training strategies are investigated, and a streaming inference method implemented in a recurrent mode is used to evaluate the performance of the proposed method for real-time howling suppression. Deep AHS avoids howling detection and intrinsically prevents howling from occurring, allowing for more flexibility in the design of audio systems. Experimental results show the effectiveness of the proposed method for howling suppression under different scenarios.
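The difference between teacher-forced training and recurrent streaming inference can be illustrated with a toy simulation. The sketch below is not the Deep AHS implementation. It assumes a single-channel loop with a linear FIR feedback path, a scalar amplifier gain, and an integer playback delay, and it reads teacher forcing as replacing the processed output in the loop with the clean target. The function names and the `suppress` callback, a stand-in for the trained network, are hypothetical.

```python
import numpy as np

def teacher_forced_mixture(speech, feedback_path, gain, delay):
    """Microphone signal when the playback is derived from the clean target.

    No recursion is needed, so the mixture is computed in one pass.
    """
    n = len(speech)
    playback = np.zeros(n)
    playback[delay:] = gain * speech[:n - delay]
    feedback = np.convolve(playback, feedback_path)[:n]
    return speech + feedback

def streaming_loop(speech, feedback_path, gain, delay, suppress):
    """Sample-by-sample closed loop: the playback is the delayed, amplified
    output of `suppress`, so every output sample affects later inputs."""
    assert delay >= 1, "the loop needs at least one sample of delay"
    n = len(speech)
    mic, out, playback = np.zeros(n), np.zeros(n), np.zeros(n)
    for t in range(n):
        playback[t] = gain * out[t - delay] if t >= delay else 0.0
        taps = min(len(feedback_path), t + 1)
        # Feedback at time t: sum_k h[k] * playback[t - k]
        fb = np.dot(feedback_path[:taps], playback[t::-1][:taps])
        mic[t] = speech[t] + fb
        # Hypothetical suppressor; an oracle version may use fb directly.
        out[t] = suppress(mic[t], fb)
    return mic, out

# Toy setup (all values are illustrative assumptions).
speech = np.sin(2 * np.pi * 0.05 * np.arange(200))
h = np.array([0.0, 0.6])   # feedback path impulse response
gain, delay = 2.0, 1       # loop gain 0.6 * 2.0 = 1.2 > 1

# Without suppression the loop is unstable and the signal grows (howling).
mic_howl, _ = streaming_loop(speech, h, gain, delay, lambda m, fb: m)

# With an oracle suppressor the loop matches the teacher-forced mixture.
mic_ideal, out_ideal = streaming_loop(speech, h, gain, delay, lambda m, fb: m - fb)
mic_tf = teacher_forced_mixture(speech, h, gain, delay)

print(np.max(np.abs(mic_howl)) > np.max(np.abs(mic_tf)))
print(np.allclose(mic_ideal, mic_tf))
```

In this toy setting, the teacher-forced mixture is what the closed loop would produce if the suppressor were perfect, which is why training on it avoids unrolling the loop. At inference time the suppressor's own errors re-enter the microphone signal, which is why evaluation in a recurrent streaming mode matters.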