Neural-network-based speech dereverberation has achieved promising results in recent studies. Nevertheless, many studies focus on recovering only the direct-path sound, discarding early reflections that could benefit speech perception. The performance of a model trained to recover clean speech degrades when it is evaluated on early-reverberation targets, and vice versa. This paper proposes a novel deep-neural-network-based multichannel speech dereverberation algorithm in which the dereverberation level is controllable. This is realized by adding a simple floating-point number as a target controller of the model. Experiments are conducted using spatially distributed microphones, and the efficacy of the proposed algorithm is confirmed in various simulated conditions.
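One way to picture a floating-point target controller is as an extra input value that tells the network which target to aim for. The sketch below is illustrative only and is not taken from the paper: the function names `add_controller` and `pick_target`, the choice to append the controller to every feature frame, and the encoding (0.0 for the direct-path target, 1.0 for the direct path plus early reflections) are all assumptions made for this example.

```python
import numpy as np


def add_controller(features, level):
    """Append a scalar controller value to every frame of a feature matrix.

    features: array of shape (frames, feature_dim)
    level: hypothetical dereverberation-level controller (assumed encoding)
    """
    features = np.asarray(features, dtype=np.float32)
    # One extra column holding the same controller value for each frame.
    control = np.full((features.shape[0], 1), level, dtype=np.float32)
    return np.concatenate([features, control], axis=1)


def pick_target(direct_path, early_reverb, level):
    """Select the training target that matches the controller value.

    Assumption: 0.0 requests the direct-path target and 1.0 requests the
    direct path plus early reflections.
    """
    return early_reverb if level >= 0.5 else direct_path


# Random placeholders standing in for real features and targets.
rng = np.random.default_rng(0)
frames, feature_dim = 4, 6
noisy = rng.standard_normal((frames, feature_dim))
direct = rng.standard_normal((frames, feature_dim))
early = rng.standard_normal((frames, feature_dim))

x0 = add_controller(noisy, 0.0)  # input paired with the direct-path target
x1 = add_controller(noisy, 1.0)  # input paired with the early-reverberation target
y0 = pick_target(direct, early, 0.0)
y1 = pick_target(direct, early, 1.0)
```

Under these assumptions, a single model would see the same reverberant input paired with different targets during training, and the controller value is the only signal that distinguishes the two cases. How the paper actually injects the controller and which values it uses are not stated in the abstract.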