The intelligibility and quality of speech from a mobile phone or public announcement system are often affected by background noise in the listening environment. Pre-processing the speech signal can improve its intelligibility and quality; this is known as near-end listening enhancement (NLE). Although existing NLE techniques are able to greatly increase intelligibility in harsh noise environments, in favorable noise conditions the intelligibility of speech reaches a ceiling beyond which it cannot be further enhanced. Because existing methods focus solely on improving intelligibility, they process the speech signal more than necessary in such conditions, which leads to speech distortions and quality degradations. In this paper, we provide a new rationale for NLE, where the target speech is minimally processed in terms of a processing penalty, provided that a certain performance constraint, e.g., intelligibility, is satisfied. We present a closed-form solution for the case where the performance criterion is an intelligibility estimator based on the approximated speech intelligibility index and the processing penalty is the mean-square error between the processed and the clean speech. This produces an NLE method that adapts to changing noise conditions via a simple gain rule: it limits the processing to the minimum necessary to achieve a desired intelligibility, and in favorable noise situations it focuses on quality by minimizing the amount of speech distortion. Through simulation studies, we show that the proposed method attains speech quality on par with or better than existing methods in both objective measurements and subjective listening tests, while sustaining objective speech intelligibility performance on par with existing methods.
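
The minimum-processing rationale described above can be written as a constrained optimization problem. The following is only a sketch under stated assumptions, not the paper's exact formulation: the per-band gains $g_j$, the clean-speech band signals $X_j$, the speech and noise band powers $\sigma_{X_j}^2$ and $\sigma_{N_j}^2$, the band-importance weights $\gamma_j$, and the target intelligibility $I_0$ are all notation introduced here for illustration, and the intelligibility estimator shown is one commonly used form of the approximated speech intelligibility index.

```latex
% Sketch only: all symbols are illustrative assumptions, not taken from the abstract.
\begin{aligned}
\min_{\{g_j \ge 0\}} \quad
  & \sum_{j} \mathbb{E}\!\left[ \left( g_j X_j - X_j \right)^2 \right]
    = \sum_{j} (g_j - 1)^2 \, \sigma_{X_j}^2
    && \text{(processing penalty: mean-square error)} \\
\text{subject to} \quad
  & \hat{I}(g) = \sum_{j} \gamma_j \, \frac{\xi_j(g_j)}{\xi_j(g_j) + 1} \;\ge\; I_0,
  \qquad \xi_j(g_j) = \frac{g_j^2 \, \sigma_{X_j}^2}{\sigma_{N_j}^2}
    && \text{(intelligibility constraint)}
\end{aligned}
```

Under this formulation, if the unprocessed speech ($g_j = 1$ for all $j$) already satisfies $\hat{I}(g) \ge I_0$, the penalty is zero and no processing is applied, which corresponds to the favorable-noise behaviour described in the abstract. Otherwise the gains move away from unity only as far as is needed to meet the constraint.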