Deep learning approaches have demonstrated success in modeling analog audio effects such as distortion and overdrive. Nevertheless, challenges remain in modeling more complex effects, such as dynamic range compressors, along with their variable parameters. Previous methods are computationally complex and noncausal, which prohibits real-time operation, a critical requirement in audio production contexts. They additionally rely on large training datasets, which are time-intensive to generate. In this work, we demonstrate that shallower temporal convolutional networks (TCNs) that exploit very large dilation factors to obtain a large receptive field can achieve state-of-the-art performance while remaining efficient. Not only are these models found to be perceptually similar to the original effect, they also achieve a 4x speedup, enabling real-time operation on CPU, and can be trained using only 1% of the data used by previous methods.
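To illustrate why large dilation factors let a shallow network cover a long stretch of audio, the following minimal sketch computes the receptive field of a stack of stride-1 dilated convolutions using the standard formula 1 + sum((k - 1) * d_i). The kernel sizes, layer counts, dilation growth factors, and the 44.1 kHz sample rate are hypothetical values chosen for illustration; they are not taken from this work.

```python
def dilation_schedule(num_layers, growth):
    """Dilation of each layer when it grows geometrically: growth**0, growth**1, ..."""
    return [growth ** i for i in range(num_layers)]


def receptive_field(kernel_size, dilations):
    """Receptive field, in samples, of stacked stride-1 dilated convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Hypothetical configurations (assumptions, not the settings of this work).
SAMPLE_RATE = 44100  # assumed sample rate in Hz

# A deeper stack with small kernels and dilations that double per layer.
deep_rf = receptive_field(kernel_size=3, dilations=dilation_schedule(10, 2))

# A shallower stack with larger kernels and dilations that grow tenfold per layer.
shallow_rf = receptive_field(kernel_size=13, dilations=dilation_schedule(4, 10))

for name, rf in [("deep", deep_rf), ("shallow", shallow_rf)]:
    print(f"{name}: {rf} samples, {1000 * rf / SAMPLE_RATE:.1f} ms")
```

Under these assumed settings the four-layer stack spans 13333 samples, while the ten-layer stack spans 2047, which shows how rapidly growing dilations can trade depth for temporal context. For causal operation, each layer would additionally be padded only on the left so that no output depends on future samples.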