Audio effects are essential to music production, and modeling analog audio effects has therefore been researched extensively for decades using system-identification methods, circuit simulation, and, more recently, deep learning. However, only a few works have tackled the reconstruction of signals that were processed by an audio effect unit. Given the recent advances in music source separation and automatic mixing, the removal of audio effects could facilitate an automatic remixing system. This paper focuses on removing distortion and clipping applied to guitar tracks for music production and presents a comparative investigation of different deep neural network (DNN) architectures on this task. We achieve exceptionally good results in distortion removal using DNNs for effects that superimpose the clean signal onto the distorted signal, while the task is more challenging if the clean signal is not superimposed. Nevertheless, in the latter case, the neural models under evaluation surpass one state-of-the-art declipping system in terms of source-to-distortion ratio, yielding better quality and faster inference.
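To make the two effect categories and the evaluation metric concrete, the following minimal sketch contrasts a hard clipper (no clean signal in the output) with an overdrive that blends the clean signal back in, and scores each against the clean reference. Everything here is an illustrative assumption rather than the setup used in this work: the function names `hard_clip`, `overdrive_with_blend`, and `sdr_db` are hypothetical, the tanh nonlinearity and the sine test tone stand in for real effect units and guitar recordings, and the source-to-distortion ratio is taken in its simplest form, $10\log_{10}(\lVert s\rVert^2 / \lVert s - \hat{s}\rVert^2)$, which may differ from the exact variant used in the evaluation.

```python
import numpy as np

def hard_clip(x, threshold):
    """Hard clipping: samples beyond +/- threshold are flattened (assumed model)."""
    return np.clip(x, -threshold, threshold)

def overdrive_with_blend(x, gain=10.0, mix=0.5):
    """Hypothetical overdrive that superimposes the clean signal onto a
    tanh-distorted copy; `mix` is the share of clean signal in the output."""
    distorted = np.tanh(gain * x)
    return mix * x + (1.0 - mix) * distorted

def sdr_db(reference, estimate, eps=1e-12):
    """Simple source-to-distortion ratio in dB (assumed definition)."""
    error = reference - estimate
    return 10.0 * np.log10((np.sum(reference ** 2) + eps) / (np.sum(error ** 2) + eps))

# Synthetic stand-in for a clean guitar note: one second of a 110 Hz sine.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
clean = 0.8 * np.sin(2.0 * np.pi * 110.0 * t)

clipped = hard_clip(clean, threshold=0.3)   # clean signal not superimposed
blended = overdrive_with_blend(clean)       # clean signal superimposed

# SDR of the degraded signals themselves, i.e. the baseline a removal
# model would have to improve upon.
sdr_clipped = sdr_db(clean, clipped)
sdr_blended = sdr_db(clean, blended)
print(f"SDR, hard clipping:     {sdr_clipped:.2f} dB")
print(f"SDR, blended overdrive: {sdr_blended:.2f} dB")
```

In an actual experiment, `estimate` would be the output of a restoration model, and the improvement over these input SDR values would indicate how much distortion was removed.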