We present Self-Remixing, a novel self-supervised speech separation method that refines a pre-trained separation model in an unsupervised manner. The proposed method consists of a shuffler module and a solver module, which improve together through separation and remixing processes. Specifically, the shuffler first separates observed mixtures and creates pseudo-mixtures by shuffling and remixing the separated signals. The solver then separates the pseudo-mixtures and remixes the separated signals back into the observed mixtures. The solver is trained using the observed mixtures as supervision, while the shuffler's weights are updated as a moving average of the solver's weights, so that the shuffler generates pseudo-mixtures with fewer distortions. Our experiments demonstrate that Self-Remixing outperforms existing remixing-based self-supervised methods at the same or lower training cost in the unsupervised setup. Self-Remixing also outperforms baselines in semi-supervised domain adaptation, showing its effectiveness in multiple setups.
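The data flow described above (separate, shuffle and remix, separate again, remix back, then a moving-average weight update) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names (`shuffle_and_remix`, `remix_back`, `mse`, `ema_update`), the mean-squared-error loss, and the decay value are assumptions chosen for illustration, the neural separators are left out entirely, and the solver's outputs are assumed to come back in the same source order as the shuffler's.

```python
import random


def shuffle_and_remix(separated, rng):
    """Shuffle each source slot across the batch and sum into pseudo-mixtures.

    separated[b][n] is the n-th separated signal (list of floats) of mixture b.
    Returns (pseudo_mixtures, perms), where perms[n][b] is the index of the
    observed mixture that supplied slot n of pseudo-mixture b.
    """
    batch, n_src = len(separated), len(separated[0])
    perms = []
    for _ in range(n_src):
        perm = list(range(batch))
        rng.shuffle(perm)  # independent permutation per source slot
        perms.append(perm)
    pseudo = []
    for b in range(batch):
        sources = [separated[perms[n][b]][n] for n in range(n_src)]
        pseudo.append([sum(samples) for samples in zip(*sources)])
    return pseudo, perms


def remix_back(solver_out, perms):
    """Route each solver output back to its original mixture and sum.

    Assumption: solver_out[b][n] is already aligned with slot n.
    """
    batch, n_src = len(solver_out), len(solver_out[0])
    length = len(solver_out[0][0])
    restored = [[0.0] * length for _ in range(batch)]
    for n in range(n_src):
        for b in range(batch):
            origin = perms[n][b]
            for t in range(length):
                restored[origin][t] += solver_out[b][n][t]
    return restored


def mse(estimate, target):
    """Mean squared error between two batches of signals (illustrative loss)."""
    errors = [(e - t) ** 2
              for est, tgt in zip(estimate, target)
              for e, t in zip(est, tgt)]
    return sum(errors) / len(errors)


def ema_update(shuffler_w, solver_w, decay=0.999):
    """Move the shuffler's weights toward the solver's (decay is illustrative)."""
    return [decay * s + (1.0 - decay) * v for s, v in zip(shuffler_w, solver_w)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in for shuffler outputs: 3 mixtures, 2 sources each, 2 samples long.
    separated = [
        [[1.0, 2.0], [0.5, 0.5]],
        [[3.0, 4.0], [1.5, 2.5]],
        [[5.0, 6.0], [2.0, 1.0]],
    ]
    observed = [[sum(s) for s in zip(*srcs)] for srcs in separated]
    pseudo, perms = shuffle_and_remix(separated, rng)
    # Stand-in for a perfect solver: it recovers exactly the remixed sources.
    solver_out = [[separated[perms[n][b]][n] for n in range(2)] for b in range(3)]
    restored = remix_back(solver_out, perms)
    print("loss against observed mixtures:", mse(restored, observed))
```

In this toy setting the shuffler's outputs sum exactly to the observed mixtures, so a perfect solver yields zero loss. With real separators, the loss between the remixed signals and the observed mixtures would be the training signal for the solver, and `ema_update` would be applied to the shuffler's parameters after each solver update.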