Neural audio synthesis is an actively researched topic that has yielded a wide range of techniques leveraging machine learning architectures. Google Magenta introduced a novel approach called Differentiable Digital Signal Processing (DDSP) that combines deep neural networks with preconditioned digital signal processing techniques, reaching state-of-the-art results, especially in timbre transfer applications. However, most of these techniques, including DDSP, are generally not applicable under real-time constraints, making them unsuitable for a musical workflow. In this paper, we present a real-time implementation of the DDSP library embedded in a virtual synthesizer as a plug-in that can be used in a Digital Audio Workstation. We focused on timbre transfer from learned representations of real instruments to arbitrary sound inputs, as well as on controlling these models via MIDI. Furthermore, we developed a GUI with intuitive high-level controls that can be used for post-processing and manipulating the parameters estimated by the neural network. We conducted an online user experience test with seven participants. The results indicated that our users found the interface appealing, easy to understand, and worth exploring further. At the same time, we identified issues in the timbre transfer quality, in some components we did not implement, and in the installation and distribution of our plug-in. The next iteration of our design will address these issues. Our real-time JUCE and MATLAB implementations are available at https://github.com/SMC704/juce-ddsp and https://github.com/SMC704/matlab-ddsp, respectively.