We propose GANStrument, a generative adversarial model for instrument sound synthesis. Given a one-shot sound as input, it generates pitched instrument sounds that reflect the timbre of the input within interactive time. By exploiting instance conditioning, GANStrument achieves better fidelity and diversity of synthesized sounds and generalizes better to a variety of inputs. In addition, we introduce an adversarial training scheme for a pitch-invariant feature extractor that significantly improves pitch accuracy and timbre consistency. Experimental results show that GANStrument outperforms strong baselines that do not use instance conditioning, in terms of both generation quality and input editability. Qualitative examples are available online.