Traditional source separation approaches train deep neural networks end-to-end, with all data available at once, by minimizing the empirical risk over the whole training set. At inference time, after training, the user fetches a static computation graph and runs the full model on a given observed mixture signal to obtain the estimated source signals. Moreover, many of these models consist of several basic processing blocks that are applied sequentially. We argue that resource efficiency can be significantly increased during both training and inference by reformulating a model's training and inference procedures as iterative mappings of latent signal representations. First, the same processing block can be applied more than once to its own output to refine the latent signal, which improves parameter efficiency. During training, a block-wise procedure reduces memory requirements; thus, one can train a very complicated network structure with significantly less computation than end-to-end training. During inference, a gating module dynamically adjusts how many processing blocks, and how many iterations of a specific block, an input signal needs.
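The dynamic-depth inference idea above can be illustrated with a minimal sketch: a single processing block is applied repeatedly to refine a latent representation, and a simple gating rule decides when further iterations stop helping. The function names (`block`, `gate`, `iterative_inference`) and the contraction used as a stand-in block are illustrative assumptions, not the actual learned model or gating module described here.

```python
# Illustrative sketch only: `block` stands in for a learned processing
# block, and `gate` for a learned gating module. Here the "block" is a
# simple contraction so the iteration provably converges.

def block(z):
    # Stand-in refinement step: pulls each latent value toward a
    # fixed point (x = 0.5 * x + 1.0 has fixed point x = 2.0).
    return [0.5 * v + 1.0 for v in z]

def gate(prev, cur, tol=1e-3):
    # Halt once successive refinements barely change the latent.
    return max(abs(a - b) for a, b in zip(prev, cur)) < tol

def iterative_inference(z, max_iters=50):
    # Apply the same block repeatedly; the gate decides the depth
    # per input instead of a fixed static computation graph.
    for i in range(max_iters):
        z_next = block(z)
        if gate(z, z_next):
            return z_next, i + 1
        z = z_next
    return z, max_iters

z, n_iters = iterative_inference([8.0, -3.0, 0.0])
```

Inputs that start far from the fixed point trigger more iterations, while inputs that are already well-refined exit early, which is the source of the claimed compute savings at inference time.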