DrumGAN VST: A Plugin for Drum Sound Analysis/Synthesis With Autoencoding Generative Adversarial Networks

From arXiv: 7 pages, 2 figures, 3 tables. Presented at the ICML 2022 Machine Learning for Audio Synthesis (MLAS) Workshop. Sound examples: https://cslmusicteam.sony.fr/drumgan-vst/

In contemporary popular music production, drum sound design is commonly performed by cumbersome browsing and processing of pre-recorded samples in sound libraries. One can also use specialized synthesis hardware, typically controlled through low-level, musically meaningless parameters. Today, the field of Deep Learning offers methods to control the synthesis process via learned high-level features and allows generating a wide variety of sounds. In this paper, we present DrumGAN VST, a plugin for synthesizing drum sounds using a Generative Adversarial Network. DrumGAN VST operates on 44.1 kHz sample-rate audio, offers independent and continuous instrument class controls, and features an encoding neural network that maps sounds into the GAN's latent space, enabling resynthesis and manipulation of pre-existing drum sounds. We provide numerous sound examples and a demo of the proposed VST plugin.
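To make the encode, manipulate, resynthesize workflow from the abstract concrete, here is a minimal toy sketch. It is not the DrumGAN VST model: the generator is replaced by a fixed random linear map and the encoder by its pseudo-inverse, and all names (`generate`, `encode`, `CLASSES`, `LATENT_DIM`, `N_SAMPLES`) are hypothetical choices made for illustration. Only the 44.1 kHz sample rate and the idea of continuous class controls come from the abstract.

```python
import numpy as np

SAMPLE_RATE = 44_100                    # sample rate stated in the abstract
N_SAMPLES = 1024                        # toy clip length (assumption)
LATENT_DIM = 16                         # toy latent size (assumption)
CLASSES = ("kick", "snare", "cymbal")   # hypothetical instrument classes

rng = np.random.default_rng(0)

# Toy stand-in for a trained generator: a fixed random linear map from
# [latent vector, class controls] to a waveform. A real GAN is a deep
# nonlinear network; this only illustrates the data flow.
W = rng.standard_normal((N_SAMPLES, LATENT_DIM + len(CLASSES)))

# Toy stand-in for the encoder: the pseudo-inverse of the generator map.
W_PINV = np.linalg.pinv(W)


def generate(z, class_controls):
    """Synthesize a waveform from a latent vector and continuous class controls."""
    return W @ np.concatenate([z, class_controls])


def encode(audio):
    """Map a waveform back to (latent vector, class controls)."""
    code = W_PINV @ audio
    return code[:LATENT_DIM], code[LATENT_DIM:]


# Resynthesis: encode an existing sound, then morph its class controls.
z = rng.standard_normal(LATENT_DIM)
kick = generate(z, np.array([1.0, 0.0, 0.0]))      # a pure "kick"
z_hat, c_hat = encode(kick)                        # recover latent + controls
hybrid = generate(z_hat, np.array([0.5, 0.5, 0.0]))  # half kick, half snare
```

Because the class controls are continuous, any mixture such as `[0.5, 0.5, 0.0]` is a valid input, which is what allows interpolation between instrument types for an encoded sound in this toy setting.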
