Recent deep learning approaches have shown great improvement in audio source separation tasks. However, the vast majority of such work focuses on improving average separation performance, often neglecting to examine or control the distribution of the results. In this paper, we propose a simple, unified gradient reweighting scheme: a lightweight modification that biases the learning process of a model and steers it towards a specific distribution of results. More specifically, we reweight the gradient updates of each batch using a user-specified probability distribution. We apply this method to various source separation tasks in order to shift the operating point of the models towards different objectives. We demonstrate that different parameterizations of our unified reweighting scheme can be used to address several real-world problems, such as unreliable separation estimates. Our framework enables the user to control a robustness trade-off between worst-case and average performance. Moreover, we show experimentally that our unified reweighting scheme can also shift the focus of the model towards being more accurate for user-specified sound classes, or even towards easier examples in order to enable faster convergence.
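The core idea of reweighting each batch's gradient updates with a user-specified probability distribution can be sketched as follows. This is a minimal illustration, not the paper's exact implementation: the helper names (`reweighted_gradient`, `softmax_on_loss`) and the softmax-on-loss weighting are assumptions chosen to show one plausible parameterization that emphasizes hard examples (the worst-case end of the robustness trade-off).

```python
import numpy as np

def reweighted_gradient(per_example_grads, per_example_losses, weight_fn):
    """Combine per-example gradients into one update, weighting each
    example by a user-specified distribution over the batch.

    per_example_grads: array of shape (batch, n_params)
    per_example_losses: array of shape (batch,)
    weight_fn: maps losses -> nonnegative weights (normalized below)
    """
    w = weight_fn(per_example_losses)
    w = w / w.sum()  # normalize to a probability distribution over the batch
    return (w[:, None] * per_example_grads).sum(axis=0)

def softmax_on_loss(losses, temperature=1.0):
    """One possible weighting: softmax over per-example losses, so that
    higher-loss (harder) examples receive larger weight. Lower
    temperature concentrates weight on the hardest examples; a negative
    temperature would instead favor easier examples."""
    z = losses / temperature
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Usage: two examples, the second much harder; its gradient dominates.
grads = np.array([[1.0, 0.0],
                  [0.0, 1.0]])
losses = np.array([0.0, 10.0])
update = reweighted_gradient(grads, losses, softmax_on_loss)
```

Recovering the usual average-loss update is just the special case of uniform weights (`weight_fn = lambda L: np.ones_like(L)`), which makes the scheme a strict generalization of standard mini-batch training.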