Most recent neural source separation systems rely on a masking-based pipeline in which a set of multiplicative masks is estimated from, and applied to, a signal representation of the input mixture. In almost all network architectures, these masks are estimated by a single layer followed by an optional nonlinear activation function. However, recent studies have investigated deep mask estimation modules and observed performance improvements over shallow ones. In this paper, we analyze the role of such a deeper mask estimation module by connecting it to a recently proposed unsupervised source separation method, and empirically show that the deep mask estimation module is an efficient approximation of the so-called overseparation-grouping paradigm with the conventional shallow mask estimation layers.
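To make the terminology concrete, the following is a minimal sketch of the three mask estimation variants named in the abstract, operating on a single frame. It is an illustration under stated assumptions, not the authors' implementation: the layer sizes, the ReLU and sigmoid activations, the random untrained weights, and in particular the reading of overseparation-grouping as "estimate more masks than sources, then sum the masks within each group" are all assumptions made for this example. Every function and variable name is hypothetical.

```python
import math
import random

def dense(weights, bias, x):
    """One dense layer on a single frame: weights @ x + bias."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def random_layer(rng, n_out, n_in):
    """Hypothetical untrained layer with small random weights and zero bias."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def to_masks(logits, n_masks, n_feat):
    """Sigmoid activation, then split the flat output into n_masks masks."""
    act = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    return [act[k * n_feat:(k + 1) * n_feat] for k in range(n_masks)]

def shallow_masks(layer, hidden, n_masks, n_feat):
    """Shallow module: a single layer followed by a nonlinearity."""
    return to_masks(dense(*layer, hidden), n_masks, n_feat)

def deep_masks(layers, hidden, n_masks, n_feat):
    """Deep module (assumed form): ReLU layers stacked before the output layer."""
    for weights, bias in layers[:-1]:
        hidden = [max(0.0, z) for z in dense(weights, bias, hidden)]
    return to_masks(dense(*layers[-1], hidden), n_masks, n_feat)

def grouped_masks(layer, hidden, groups, n_feat):
    """Assumed overseparation-grouping: shallow layer, then sum masks per group."""
    n_over = sum(len(g) for g in groups)
    over = shallow_masks(layer, hidden, n_over, n_feat)
    return [[sum(over[k][f] for k in g) for f in range(n_feat)] for g in groups]

def apply_masks(masks, mixture):
    """Multiply each mask element-wise with the mixture representation."""
    return [[m * x for m, x in zip(mask, mixture)] for mask in masks]

rng = random.Random(0)
n_feat, n_hidden, n_src = 4, 6, 2
mixture = [rng.uniform(0.0, 1.0) for _ in range(n_feat)]    # one frame of the mixture
hidden = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]  # separator output for that frame

shallow_layer = random_layer(rng, n_src * n_feat, n_hidden)
shallow = apply_masks(shallow_masks(shallow_layer, hidden, n_src, n_feat), mixture)

deep_layers = [random_layer(rng, n_hidden, n_hidden),
               random_layer(rng, n_src * n_feat, n_hidden)]
deep = apply_masks(deep_masks(deep_layers, hidden, n_src, n_feat), mixture)

groups = [[0, 1], [2, 3]]  # hypothetical assignment of 4 overseparated masks to 2 sources
over_layer = random_layer(rng, 4 * n_feat, n_hidden)
grouped = apply_masks(grouped_masks(over_layer, hidden, groups, n_feat), mixture)
```

All three variants return one masked copy of the mixture frame per source. The only difference is how the masks are produced, which is the comparison the abstract is concerned with. With trained weights, a real system would apply the same steps to every frame of the representation.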