Frequency-domain beamformers have been successful in a wide range of multi-channel neural separation systems in recent years. However, the operations in conventional frequency-domain beamformers are typically independently defined and complex-valued, which results in two drawbacks: the former does not fully exploit the advantage of end-to-end optimization, and the latter may introduce numerical instability during training. Motivated by the recent success of end-to-end neural separation systems, in this paper we propose the time-domain real-valued generalized Wiener filter (TD-GWF), a linear filter defined on a 2-D learnable real-valued signal transform. TD-GWF splits the transformed representation into groups and performs a minimum mean-square error (MMSE) estimation across all available channels on each group. We show how TD-GWF can be connected to conventional filter-and-sum beamformers when a certain signal transform and number of groups are specified. Moreover, given the recent success of sequential neural beamforming frameworks, we show how TD-GWF can be applied in such frameworks to perform iterative beamforming and separation and obtain an overall performance gain. Comprehensive experimental results show that TD-GWF consistently outperforms conventional frequency-domain beamformers in the sequential neural beamforming pipeline across various neural network architectures, microphone array scenarios, and task configurations.
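To make the grouped MMSE step concrete, the following is a minimal NumPy sketch of a real-valued, group-wise multi-channel linear MMSE (least-squares) filter. It is an illustration under stated assumptions, not the authors' implementation: the function name `grouped_mmse_filter`, the tensor layout `(channels, frames, features)`, the use of a single time-invariant filter per group, and the diagonal loading term `eps` are all hypothetical choices that the abstract does not specify. The target `S` stands in for whatever reference estimate a separation network might provide.

```python
import numpy as np


def grouped_mmse_filter(Y, S, num_groups, eps=1e-6):
    """Group-wise real-valued linear MMSE filtering (illustrative sketch).

    Y: (C, T, N) real-valued transformed multi-channel mixture (assumed layout).
    S: (T, N) real-valued target estimate in the same transform domain.
    Returns a (T, N) filtered estimate of S.
    """
    C, T, N = Y.shape
    if N % num_groups != 0:
        raise ValueError("feature dimension must be divisible by num_groups")
    K = N // num_groups  # features per group
    out = np.empty((T, N), dtype=Y.dtype)
    for g in range(num_groups):
        sl = slice(g * K, (g + 1) * K)
        # Stack every channel's features for this group: one (C*K) vector per frame.
        Yg = Y[:, :, sl].transpose(1, 0, 2).reshape(T, C * K)
        Sg = S[:, sl]
        # Sample covariance of the input and cross-covariance with the target.
        Ryy = Yg.T @ Yg / T
        Rys = Yg.T @ Sg / T
        # Least-squares solution W = Ryy^{-1} Rys, with diagonal loading
        # (an assumption added here for numerical stability).
        W = np.linalg.solve(Ryy + eps * np.eye(C * K), Rys)
        out[:, sl] = Yg @ W
    return out


# Toy usage: the target is an exact linear mix of two channels,
# so the filter should reproduce it almost perfectly.
rng = np.random.default_rng(0)
Y_demo = rng.standard_normal((2, 200, 8))
S_demo = 0.5 * Y_demo[0] + 0.25 * Y_demo[1]
S_hat = grouped_mmse_filter(Y_demo, S_demo, num_groups=2)
```

In this sketch, setting `num_groups` equal to the feature dimension makes each transform coefficient be filtered across channels only, which is loosely reminiscent of a per-bin beamformer; whether that matches the exact correspondence the paper establishes is not stated in the abstract.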