Removing reverb from reverberant music is a necessary step in cleaning up audio for downstream music manipulation. Reverberation in music falls into two categories: natural reverb and artificial reverb. Artificial reverb is more diverse than natural reverb because of its many parameter settings and reverberation types. However, recent supervised dereverberation methods may fail because they rely on sufficiently diverse and numerous pairs of reverberant observations and retrieved data for training in order to generalize to unseen observations at inference time. To resolve these problems, we propose an unsupervised method that can remove a general kind of artificial reverb from music without requiring paired data for training. The proposed method is based on diffusion models: it initializes the unknown reverberation operator with a conventional signal processing technique and simultaneously refines the estimate with the help of diffusion models. We show through objective and perceptual evaluations that our method outperforms the current leading vocal dereverberation benchmarks.
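To make the notion of a "reverberation operator" concrete, the sketch below shows a toy forward model only, not the proposed method. It assumes, purely for illustration, that artificial reverb acts as linear convolution with an impulse response made of exponentially decaying noise. The helper names `synthetic_impulse_response` and `apply_reverb` and the decay-time parameter `rt60_s` are hypothetical and do not come from the abstract.

```python
import numpy as np

def synthetic_impulse_response(rt60_s, sample_rate=16000, seed=0):
    """Toy reverb tail (an assumption): white noise under an exponential envelope."""
    rng = np.random.default_rng(seed)
    n = int(round(rt60_s * sample_rate))
    t = np.arange(n) / sample_rate
    # The envelope drops by 60 dB (a factor of 10**-3) at t = rt60_s.
    decay = 3.0 * np.log(10.0) / rt60_s
    h = rng.standard_normal(n) * np.exp(-decay * t)
    h[0] = 1.0  # direct path
    return h

def apply_reverb(dry, impulse_response):
    """Assumed forward model: reverberant = dry convolved with the impulse response."""
    return np.convolve(dry, impulse_response)

sample_rate = 16000
dry = np.zeros(sample_rate)
dry[0] = 1.0  # a single click stands in for a dry recording

# Two parameter setups give two very different observations of the same dry signal.
wet_short = apply_reverb(dry, synthetic_impulse_response(0.2, sample_rate))
wet_long = apply_reverb(dry, synthetic_impulse_response(1.0, sample_rate))
```

Even in this simplified setting, each choice of decay time yields a different observation of the same dry signal, which hints at why a supervised model needs many diverse training pairs to cover the space of artificial reverbs.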