We introduce a new system for data-driven sound model design built around two different neural network architectures, a Generative Adversarial Network (GAN) and a Recurrent Neural Network (RNN), that takes advantage of the unique characteristics of each to achieve system objectives that neither can address alone. The objective of the system is to generate interactively controllable sound models given (a) a range of sounds the model should be able to synthesize, and (b) a specification of the parametric controls for navigating that space of sounds. The range of sounds is defined by a dataset provided by the designer, while the means of navigation is defined by a combination of data labels and the selection of a sub-manifold from the latent space learned by the GAN. Our proposed system takes advantage of the rich latent space of the GAN, which contains sounds that fill out the spaces "between" real data-like sounds. This augmented data from the GAN is then used to train an RNN, chosen for its ability to respond immediately and continuously to parameter changes and to generate audio over unlimited periods of time. Furthermore, we develop a self-organizing map technique for "smoothing" the latent space of the GAN that results in perceptually smooth interpolation between audio timbres. We validate this process through user studies. The system advances the state of the art in generative sound model design through a system configuration and components that improve interpolation, and through the expansion of audio modeling capabilities beyond musical pitch and percussive instrument sounds into the more complex space of audio textures.
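As a rough illustration of what navigating a sub-manifold of a latent space with a parametric control can look like, the sketch below samples a straight line between two latent vectors and maps a control value in [0, 1] to a point on that line. This is only a minimal sketch under stated assumptions: the straight-line path, the function names `latent_path` and `control_to_latent`, and the toy two-dimensional vectors are all hypothetical and are not taken from the system described above, whose sub-manifold selection and self-organizing map smoothing are not specified here.

```python
import numpy as np


def latent_path(z_start, z_end, num_steps):
    """Return num_steps latent vectors evenly spaced on the line from z_start to z_end.

    Hypothetical helper: a straight line is assumed here as the simplest
    possible one-dimensional sub-manifold of a latent space.
    """
    z_start = np.asarray(z_start, dtype=float)
    z_end = np.asarray(z_end, dtype=float)
    # t runs from 0 to 1 inclusive; shape (num_steps, 1) broadcasts over latent dims.
    t = np.linspace(0.0, 1.0, num_steps)[:, None]
    return (1.0 - t) * z_start + t * z_end


def control_to_latent(path, control):
    """Map a control value in [0, 1] to a latent vector on the sampled path.

    Values outside [0, 1] are clipped; values between samples are linearly
    interpolated from the two neighbouring samples.
    """
    control = float(np.clip(control, 0.0, 1.0))
    pos = control * (len(path) - 1)
    lo = int(np.floor(pos))
    hi = min(lo + 1, len(path) - 1)
    frac = pos - lo
    return (1.0 - frac) * path[lo] + frac * path[hi]


# Toy 2-D latent vectors standing in for two points in a learned latent space.
path = latent_path([0.0, 0.0], [1.0, 2.0], 5)
midpoint = control_to_latent(path, 0.5)
```

In a pipeline of the kind outlined above, each vector on such a path would be passed to a trained generator to produce audio, and the control value would serve as the conditioning parameter for the downstream model; both steps are omitted here because they depend on models this sketch does not assume.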