Monaural speech enhancement (SE) offers a versatile and cost-effective approach to SE tasks because it requires recordings from only a single microphone. However, monaural SE lags behind multi-channel SE in performance: monaural methods cannot extract spatial information from single-channel recordings, which greatly limits their application scenarios. To address this issue, we inject spatial information into a monaural SE model and propose a knowledge distillation strategy that enables the monaural SE model to learn binaural speech features from a binaural SE model. This allows the monaural SE model to reconstruct enhanced speech with higher intelligibility and quality under low signal-to-noise ratio (SNR) conditions. Extensive experiments show that the proposed monaural SE model, with spatial information injected via knowledge distillation, performs favorably against other monaural SE models while using fewer parameters.