Differentiable Wavetable Synthesis (DWTS) is a technique for neural audio synthesis that learns a dictionary of one-period waveforms, i.e., wavetables, through end-to-end training. We achieve high-fidelity audio synthesis with as few as 10 to 20 wavetables and demonstrate how a data-driven dictionary of waveforms opens up unprecedented one-shot learning paradigms on short audio clips. Notably, we show audio manipulations, such as high-quality pitch shifting, using only a few seconds of input audio. Lastly, we investigate performance gains from using learned wavetables for real-time and interactive audio synthesis.
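To make the idea of a wavetable dictionary concrete, the following is a minimal sketch of plain wavetable synthesis, not the authors' implementation. It assumes that the output is a weighted sum of the tables and that each table is read with linear interpolation; the abstract specifies neither. Fixed sine tables stand in for wavetables that DWTS would learn, and all names (`make_sine_table`, `read_table`, `synthesize`) are hypothetical. Because the interpolated lookup is linear in the table values, a gradient-based framework could in principle update the tables through it.

```python
import math


def make_sine_table(size, harmonic=1):
    """Return one period of a sine at the given harmonic, sampled at `size` points.

    Stand-in for a learned wavetable (assumption: DWTS would learn these values).
    """
    return [math.sin(2.0 * math.pi * harmonic * i / size) for i in range(size)]


def read_table(table, phase):
    """Read a wavetable at `phase` in [0, 1) using linear interpolation."""
    size = len(table)
    pos = phase * size
    i0 = int(pos) % size
    i1 = (i0 + 1) % size          # wrap around: the table holds exactly one period
    frac = pos - int(pos)
    return (1.0 - frac) * table[i0] + frac * table[i1]


def synthesize(tables, weights, f0, sample_rate, num_samples):
    """Render audio as a weighted sum of wavetables read at fundamental f0 (Hz).

    Assumption: tables are mixed by a fixed weighted sum; a trained model
    might instead predict time-varying weights.
    """
    phase = 0.0
    out = []
    for _ in range(num_samples):
        out.append(sum(w * read_table(t, phase) for t, w in zip(tables, weights)))
        # Phase accumulator: advance by one sample's worth of the period.
        phase = (phase + f0 / sample_rate) % 1.0
    return out


tables = [make_sine_table(512, 1), make_sine_table(512, 2)]
weights = [0.7, 0.3]
audio = synthesize(tables, weights, f0=220.0, sample_rate=16000, num_samples=16000)
# Pitch shifting reuses the same tables and only changes the read rate.
shifted = synthesize(tables, weights, f0=440.0, sample_rate=16000, num_samples=16000)
```

In this sketch, pitch shifting needs no new tables: the timbre lives in the dictionary, and the fundamental frequency only controls how fast the phase advances.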