Music source separation trained on paired mixture and source signals has made substantial progress over the years. However, this setting relies heavily on large amounts of paired data. Source-only supervision instead decouples the process of learning a mapping from a mixture to particular sources into a two-stage paradigm: source modeling and separation. Recent systems under source-only supervision either achieve good performance in synthetic toy experiments or achieve limited performance on music separation tasks. In this paper, we leverage flow-based implicit generators to train music source priors and a likelihood-based objective to separate music mixtures. Experiments show that, on singing voice and music separation tasks, our proposed systems achieve results competitive with one of the fully supervised systems. We also demonstrate that one variant of our proposed systems is capable of separating new source tracks effortlessly.