Deep learning approaches have demonstrated success in modeling analog audio effects. Nevertheless, challenges remain in modeling more complex effects that involve time-varying nonlinear elements, such as dynamic range compressors. Existing neural network approaches for modeling compression either ignore the device parameters, do not attain sufficient accuracy, or require large noncausal models that prohibit real-time operation. In this work, we propose a modification to temporal convolutional networks (TCNs) that enables greater efficiency without sacrificing performance. By using very sparse convolutional kernels through rapidly growing dilations, our model attains a large receptive field with fewer layers, reducing computation. Through a detailed evaluation, we demonstrate that our efficient, causal approach achieves state-of-the-art performance in modeling the analog LA-2A compressor, is capable of real-time operation on CPU, and requires only 10 minutes of training data.
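The link between dilation growth and layer count can be illustrated with a short calculation. The sketch below is not the configuration used in this work: the kernel size, the growth factors, and the helper names `receptive_field` and `layers_needed` are assumptions chosen for illustration. It uses the standard receptive-field formula for a stack of stride-1 dilated convolutions, 1 + sum((k - 1) * d_i), and compares the common doubling schedule with a faster-growing one.

```python
from typing import List


def receptive_field(kernel_size: int, dilations: List[int]) -> int:
    """Receptive field, in samples, of stacked stride-1 dilated convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def layers_needed(kernel_size: int, growth: int, target: int) -> int:
    """Smallest number of layers whose receptive field reaches `target` samples.

    Layer i uses dilation growth**i (hypothetical schedule for illustration).
    """
    dilations: List[int] = []
    while receptive_field(kernel_size, dilations) < target:
        dilations.append(growth ** len(dilations))
    return len(dilations)


if __name__ == "__main__":
    kernel_size = 3  # assumed value, not taken from the paper
    # Receptive field of four layers with dilations growing by a factor of 10.
    target = receptive_field(kernel_size, [1, 10, 100, 1000])
    for growth in (2, 10):
        n = layers_needed(kernel_size, growth, target)
        print(f"growth={growth}: {n} layers to cover {target} samples")
```

Under these assumed values, the faster-growing schedule reaches the same receptive field with fewer than half as many layers, which is the source of the computational saving described above. The real trade-off also depends on kernel size and channel width, which this sketch ignores.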