This paper introduces a new method referred to as KISS-GEV (Keep It Super Simple Generalized Eigenvalue) beamforming. While GEV beamforming usually relies on a deep neural network to estimate target and noise time-frequency masks, this method uses a signal processing approach based on the direction of arrival (DoA) of the target. This considerably reduces the amount of computation involved at test time, and the method works for speech enhancement in unseen conditions because there is no need to train a neural network with noisy speech. The proposed method can also be used to separate speech from a mixture, provided the speech sources come from different directions. Results also show that the proposed method relies on the same minimal DoA assumption as Delay-and-Sum beamforming, yet outperforms this traditional approach.
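To make the GEV step concrete, the following is a minimal sketch of how GEV beamformer weights are generally obtained for one frequency bin once a target covariance matrix and a noise covariance matrix are available: the weights are the principal generalized eigenvector of the pair, which maximizes the output signal-to-noise ratio. It is not the paper's implementation. The abstract does not say how the covariance matrices are derived from the DoA, so the steering vector, the rank-1 target covariance, and the random noise covariance below are placeholder assumptions, and the function names `gev_weights`, `snr`, and `toy_example` are hypothetical.

```python
import numpy as np


def gev_weights(phi_target, phi_noise):
    """Return the principal generalized eigenvector of (phi_target, phi_noise).

    Both inputs are Hermitian (M x M) covariance matrices for one frequency
    bin; phi_noise is assumed to be positive definite.
    """
    # Whiten with the Cholesky factor of the noise covariance: phi_noise = L L^H.
    chol = np.linalg.cholesky(phi_noise)
    chol_inv = np.linalg.inv(chol)
    whitened = chol_inv @ phi_target @ chol_inv.conj().T
    # eigh returns eigenvalues in ascending order, so the last column
    # is the eigenvector with the largest eigenvalue.
    _, vecs = np.linalg.eigh(whitened)
    # Map the eigenvector back to the original (unwhitened) coordinates.
    return chol_inv.conj().T @ vecs[:, -1]


def snr(weights, phi_target, phi_noise):
    """Output SNR (Rayleigh quotient) of a beamformer for the given covariances."""
    num = np.real(weights.conj() @ phi_target @ weights)
    den = np.real(weights.conj() @ phi_noise @ weights)
    return num / den


def toy_example(num_mics=4, seed=0):
    """Compare GEV and Delay-and-Sum output SNR on synthetic covariances."""
    rng = np.random.default_rng(seed)
    # Placeholder steering vector: unit-modulus phase terms, one per microphone.
    steering = np.exp(-1j * rng.uniform(0.0, 2.0 * np.pi, num_mics))
    # Assumed rank-1 target covariance built from the steering vector.
    phi_target = np.outer(steering, steering.conj())
    # Assumed noise covariance: random Hermitian positive-definite matrix.
    a = rng.standard_normal((num_mics, num_mics))
    b = rng.standard_normal((num_mics, num_mics))
    mix = a + 1j * b
    phi_noise = mix @ mix.conj().T + num_mics * np.eye(num_mics)

    w_gev = gev_weights(phi_target, phi_noise)
    w_das = steering / num_mics  # Delay-and-Sum weights for the same DoA
    return snr(w_gev, phi_target, phi_noise), snr(w_das, phi_target, phi_noise)


if __name__ == "__main__":
    snr_gev, snr_das = toy_example()
    print(f"GEV SNR: {snr_gev:.3f}  Delay-and-Sum SNR: {snr_das:.3f}")
```

Because the GEV weights maximize the ratio of target power to noise power for the given covariance pair, their output SNR on these matrices cannot fall below that of the Delay-and-Sum weights. How well this carries over to real recordings depends on how accurately the covariances are estimated, which this sketch does not address.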