This paper considers speech enhancement for signals that are picked up in one noisy environment and must be presented to a listener in another noisy environment. It has recently been shown that an optimal solution to this problem requires jointly considering the noise sources in both environments. However, the existing optimal mutual-information-based method requires a complicated system model that includes natural speech variations, and it relies on approximations and assumptions about the underlying signal distributions. In this paper, we propose a simpler signal model and optimize speech intelligibility based on the Approximated Speech Intelligibility Index (ASII). We derive a closed-form solution to the joint far- and near-end speech enhancement problem that is independent of the marginal distribution of the signal coefficients and achieves performance similar to that of existing work. In addition, our approach does not need to model or optimize for natural speech variations.