In this work, we present a new single-microphone speech dereverberation algorithm. First, we present a performance analysis showing that algorithms that improve only the magnitude or only the phase are insufficient. Furthermore, we demonstrate that some objective measures correlate strongly with the clean magnitude, while others correlate strongly with the clean phase. Consequently, we propose a new architecture consisting of two sub-models, each responsible for a different task. The first model estimates the clean magnitude given the noisy input. The enhanced magnitude, together with the phase of the noisy input, is then fed to the second model, which estimates the real and imaginary parts of the dereverberated signal. We also present a training scheme that includes pre-training and fine-tuning. We evaluate the proposed approach on data from the REVERB challenge and compare our results to other methods. We demonstrate consistent improvements in all measures, which can be attributed to the improved estimates of both the magnitude and the phase.
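The two-stage data flow described above can be sketched roughly as follows. This is only an illustration of how the signals are routed, not the authors' implementation: the function names `stft`, `magnitude_model`, `complex_model`, and `dereverb_pipeline`, the frame length and hop size, and the use of a short-time Fourier transform as the time-frequency representation are all assumptions. Both sub-models are replaced by trivial placeholders, since the abstract does not specify their internals.

```python
import numpy as np


def stft(x, frame_len=512, hop=128):
    """Hann-windowed short-time Fourier transform, shape (frames, bins).

    Frame length and hop are illustrative choices, not taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def magnitude_model(noisy_mag):
    """Hypothetical stand-in for the first sub-model.

    A trained network would map the noisy magnitude to an estimate of the
    clean magnitude; this placeholder returns its input unchanged.
    """
    return noisy_mag


def complex_model(enhanced_mag, noisy_phase):
    """Hypothetical stand-in for the second sub-model.

    A trained network would refine both magnitude and phase; this placeholder
    only converts polar coordinates to real and imaginary parts.
    """
    real = enhanced_mag * np.cos(noisy_phase)
    imag = enhanced_mag * np.sin(noisy_phase)
    return real, imag


def dereverb_pipeline(noisy_signal):
    """Route a noisy waveform through the two placeholder stages."""
    spec = stft(noisy_signal)
    noisy_mag = np.abs(spec)
    noisy_phase = np.angle(spec)

    # Stage 1: estimate the clean magnitude from the noisy input.
    enhanced_mag = magnitude_model(noisy_mag)

    # Stage 2: enhanced magnitude plus noisy phase -> real and imaginary parts.
    real_out, imag_out = complex_model(enhanced_mag, noisy_phase)
    return real_out + 1j * imag_out
```

With the placeholder models, `dereverb_pipeline` simply reproduces the noisy spectrogram, which makes the routing easy to check before substituting trained networks for the two stages.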