Footsteps are among the most ubiquitous sound effects in multimedia applications. There is substantial research into understanding the acoustic features of footstep sounds and developing synthesis models for them. In this paper, we present a first attempt at adopting neural synthesis for this task. We implemented two GAN-based architectures and compared the results with real recordings as well as six traditional sound synthesis methods. Our architectures reached realism scores as high as those of recorded samples, an encouraging result for neural footstep synthesis.