We propose an audio effects processing framework that learns to emulate a target electric guitar tone from a recording. We train a deep neural network using an adversarial approach, with the goal of transforming the timbre of a guitar into the timbre of another guitar after audio effects processing has been applied, for example, by a guitar amplifier. The model training requires no paired data, and the resulting model emulates the target timbre well whilst being capable of real-time processing on a modern personal computer. To verify our approach we present two experiments: one carries out unpaired training using paired data, allowing us to monitor training via objective metrics, and another uses fully unpaired data, corresponding to the realistic scenario in which a user wants to emulate a guitar timbre using only audio data from a recording. Our listening test results confirm that the models are perceptually convincing.