The majority of multichannel speech enhancement algorithms are two-step procedures that first apply a linear spatial filter, a so-called beamformer, and combine it with a single-channel approach for postprocessing. However, the serial concatenation of a linear spatial filter and a postfilter is not generally optimal in the minimum mean square error (MMSE) sense for noise distributions other than a Gaussian distribution. Rather, the MMSE optimal filter is a joint spatial and spectral nonlinear function. While estimating the parameters of such a filter with traditional methods is challenging, modern neural networks may provide an efficient way to learn the nonlinear function directly from data. To see if further research in this direction is worthwhile, in this work we examine the potential performance benefit of replacing the common two-step procedure with a joint spatial and spectral nonlinear filter. We analyze three different forms of non-Gaussianity: First, we evaluate on super-Gaussian noise with a high kurtosis. Second, we evaluate on inhomogeneous noise fields created by five interfering sources using two microphones, and third, we evaluate on real-world recordings from the CHiME3 database. In all scenarios, considerable improvements may be obtained. Most prominently, our analyses show that a nonlinear spatial filter uses the available spatial information more effectively than a linear spatial filter as it is capable of suppressing more than $D-1$ directional interfering sources with a $D$-dimensional microphone array without spatial adaptation.
翻译:多通道语音增强算法大多是两步程序,首先应用直线空间过滤器,即所谓的光束,然后将其与后处理的单一通道方法结合起来。然而,线线空间过滤器和后过滤器的序列相融合,对于除高斯分布以外的噪音分布而言,在最小平均平方差(MMSSE)感上一般不是最佳的。相反,MMSE最佳过滤器是一种空间和光谱非线性的联合非线性功能。在用传统方法来估计这种过滤器的参数具有挑战性的同时,现代神经网络可能提供一种有效的方法,直接从数据中学习非线性功能。但是,如果在这方面进行进一步的研究是值得的,我们检查用联合的空间和光谱非线性非线性过滤器取代共同两步程序的潜在性效益。我们分析了三种不同的非Gaussian形式:首先,我们用高额超Gausicial-1美元来评估超高额Gausicial-lickral-roomical roomical room fiel fiel fiel fistration fistration fithern dizere by froomizard by five five 5 offing exed exed surent exed surview surviews real real real real real real real real real real real malibal sal pal pal passal sal pal pal lifal sal sal sal sal pal pal pal passessional sal sal sal sal sal lactionsill lactional sal sal lactionsmal sessional sessional sal sal sal mal mal mal mal sal sal mal mal sal sal sal sal sal mal sal mal sal lactions mal mal sal mal messal mal mal mals mals mal mal mal mal mal mal mills mal mal mal malsalsalsalsalsalsalsaldals mal mal mal mals mals