Speech audio quality degrades under the influence of the acoustic environment and of isotropic ambient noise and point-source noise. Such degradation can reduce speech intelligibility and cause the listener to lose focus and attention. The basic acoustic parameters that characterize the environment are (i) signal-to-noise ratio (SNR), (ii) speech transmission index, (iii) reverberation time, (iv) clarity, and (v) direct-to-reverberant ratio. Except for the SNR, these parameters are usually derived from room impulse response (RIR) measurements; however, such measurements are often unavailable. This work presents a universal room acoustic estimator based on convolutional recurrent neural networks that estimates these acoustic parameters blindly and jointly. Our results indicate that the proposed system is robust to non-stationary signal variations and outperforms current state-of-the-art methods.
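For orientation, the sketch below shows how three of the listed parameters (reverberation time, clarity, and direct-to-reverberant ratio) are conventionally computed when an RIR is available. It is not the blind estimator proposed in this work; it uses common textbook definitions. The function names, the 50 ms clarity boundary, the 2.5 ms direct-path window, the -5 dB to -25 dB fit range, and the synthetic exponentially decaying RIR are all illustrative assumptions.

```python
import numpy as np


def clarity_db(rir, fs, early_ms=50.0):
    """Early-to-late energy ratio in dB (C50 when early_ms is 50)."""
    energy = np.asarray(rir, dtype=float) ** 2
    onset = int(np.argmax(energy))  # assumption: direct path is the largest peak
    split = onset + int(round(early_ms * 1e-3 * fs))
    return 10.0 * np.log10(energy[onset:split].sum() / energy[split:].sum())


def drr_db(rir, fs, window_ms=2.5):
    """Direct-to-reverberant ratio in dB, using a short window around the peak."""
    energy = np.asarray(rir, dtype=float) ** 2
    onset = int(np.argmax(energy))
    half = int(round(window_ms * 1e-3 * fs))
    lo, hi = max(onset - half, 0), onset + half + 1
    return 10.0 * np.log10(energy[lo:hi].sum() / energy[hi:].sum())


def rt60_schroeder(rir, fs, fit_db=(-5.0, -25.0)):
    """Reverberation time from a line fit to the Schroeder decay curve."""
    energy = np.asarray(rir, dtype=float) ** 2
    edc = np.cumsum(energy[::-1])[::-1]        # backward-integrated energy
    edc_db = 10.0 * np.log10(edc / edc[0])     # normalised decay curve in dB
    t = np.arange(len(energy)) / fs
    mask = (edc_db <= fit_db[0]) & (edc_db >= fit_db[1])
    slope, _ = np.polyfit(t[mask], edc_db[mask], 1)  # decay rate in dB per second
    return -60.0 / slope                       # extrapolate to a 60 dB decay


def synthetic_rir(fs=16000, rt60=0.5, duration=1.0, seed=0):
    """Toy RIR: a direct-path impulse plus exponentially decaying noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(duration * fs)) / fs
    # Amplitude falls by a factor of 1000 (60 dB) after rt60 seconds.
    rir = rng.standard_normal(len(t)) * np.exp(-np.log(1000.0) * t / rt60)
    rir[0] = 10.0  # dominant direct-path sample
    return rir


if __name__ == "__main__":
    fs = 16000
    rir = synthetic_rir(fs=fs, rt60=0.5)
    print("RT60 [s]:", rt60_schroeder(rir, fs))
    print("C50 [dB]:", clarity_db(rir, fs))
    print("DRR [dB]:", drr_db(rir, fs))
```

A blind estimator of the kind described above would have to predict these same quantities from the reverberant, noisy speech signal alone, without access to `rir`.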