We propose an audio effects processing framework that learns to emulate a target electric guitar tone from a recording. We train a deep neural network using an adversarial approach, with the goal of transforming the timbre of one guitar into the timbre of another guitar after audio effects processing has been applied, for example by a guitar amplifier. Training requires no paired data, and the resulting model emulates the target timbre well whilst being capable of real-time processing on a modern personal computer. To verify our approach we present two experiments: one carries out unpaired training using paired data, which allows us to monitor training via objective metrics, and the other uses fully unpaired data, corresponding to a realistic scenario in which a user wants to emulate a guitar timbre using only audio data from a recording. Our listening test results confirm that the models are perceptually convincing.