Timbre Transfer with Variational Auto Encoding and Cycle-Consistent Adversarial Networks

This research project investigates the application of deep learning to timbre transfer, where the timbre of a source audio is converted to the timbre of a target audio with minimal loss in quality. The adopted approach combines Variational Autoencoders with Generative Adversarial Networks to construct meaningful representations of the source audio and produce realistic generations of the target audio. It is applied to the Flickr 8k Audio dataset for transferring vocal timbre between speakers and to the URMP dataset for transferring musical timbre between instruments. Variations of the adopted approach are also trained, and their generalised performance is compared using the metrics SSIM (Structural Similarity Index) and FAD (Fréchet Audio Distance). It was found that a many-to-many approach outperforms a one-to-one approach in terms of reconstructive capabilities, and that a basic residual block design is better suited than a bottleneck design to enriching the content information in the latent space. It was also found that whether the cyclic loss follows a variational autoencoder or a vanilla autoencoder formulation has no significant impact on the reconstructive and adversarial translation aspects of the model.
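As background on the FAD metric named above: the Fréchet distance between two Gaussians N(mu_a, S_a) and N(mu_b, S_b) is ||mu_a - mu_b||^2 + Tr(S_a + S_b - 2 (S_a S_b)^(1/2)), and FAD applies it to Gaussians fitted to embeddings of reference and generated audio. The sketch below is not the evaluation code used in this project. It assumes diagonal covariances, in which case the trace term reduces to a sum of squared differences of standard deviations, and it assumes the embeddings are already available as rows of floats. The function names are hypothetical.

```python
import math


def mean_and_variance(rows):
    """Per-dimension mean and population variance of a list of embedding rows."""
    n = len(rows)
    dims = len(rows[0])
    mean = [sum(row[d] for row in rows) / n for d in range(dims)]
    var = [sum((row[d] - mean[d]) ** 2 for row in rows) / n for d in range(dims)]
    return mean, var


def frechet_distance_diag(mu_a, var_a, mu_b, var_b):
    """Squared Frechet distance between two diagonal-covariance Gaussians.

    Assumption: off-diagonal covariance terms are ignored, so
    Tr(S_a + S_b - 2 (S_a S_b)^(1/2)) becomes sum((sqrt(va) - sqrt(vb))^2).
    """
    mean_term = sum((ma - mb) ** 2 for ma, mb in zip(mu_a, mu_b))
    cov_term = sum(
        (math.sqrt(va) - math.sqrt(vb)) ** 2 for va, vb in zip(var_a, var_b)
    )
    return mean_term + cov_term


def fad_diag(reference_rows, generated_rows):
    """Fit a diagonal Gaussian to each embedding set and compare them."""
    mu_r, var_r = mean_and_variance(reference_rows)
    mu_g, var_g = mean_and_variance(generated_rows)
    return frechet_distance_diag(mu_r, var_r, mu_g, var_g)
```

A lower value means the generated embeddings are statistically closer to the reference set. A full implementation would keep the complete covariance matrices and compute a matrix square root.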


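Autoencoder

An autoencoder is an artificial neural network used to learn efficient data encodings in an unsupervised manner. Its aim is to learn a representation (encoding) of a set of data, typically for dimensionality reduction, by training the network to ignore signal "noise". Alongside the reduction side, a reconstruction side is learned, in which the autoencoder tries to generate from the reduced encoding a representation as close as possible to its original input, hence its name. Several variants of the basic model exist, which aim to force the learned representations of the input to take on useful properties. Autoencoders are applied effectively to many problems, from facial recognition to capturing the semantic meaning of words.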
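The encode-then-reconstruct idea described above can be illustrated with a toy example. The sketch below is a hypothetical minimal linear autoencoder in plain Python, not the model from the paper: it compresses 2-D points to a single number and decodes with the same weights (tied weights are an assumption made here for brevity), training by gradient descent on the mean squared reconstruction error. The data, learning rate, and step count are illustrative choices.

```python
def encode(w, x):
    """Project a 2-D point onto the weight vector to get a 1-D code."""
    return w[0] * x[0] + w[1] * x[1]


def decode(w, z):
    """Map the 1-D code back to 2-D using the same (tied) weights."""
    return (z * w[0], z * w[1])


def reconstruction_loss(w, data):
    """Mean squared error between each point and its reconstruction."""
    total = 0.0
    for x in data:
        x_hat = decode(w, encode(w, x))
        total += (x[0] - x_hat[0]) ** 2 + (x[1] - x_hat[1]) ** 2
    return total / len(data)


def train(data, w=(0.5, 0.1), lr=0.01, steps=500):
    """Gradient descent on the reconstruction loss; returns the learned weights."""
    w = list(w)
    n = len(data)
    for _ in range(steps):
        grad = [0.0, 0.0]
        for x in data:
            z = encode(w, x)
            # Residual between the input and its reconstruction.
            r = (x[0] - z * w[0], x[1] - z * w[1])
            rw = r[0] * w[0] + r[1] * w[1]
            for j in range(2):
                # d/dw_j of ||x - (w.x) w||^2, averaged over the data.
                grad[j] += -2.0 * (rw * x[j] + z * r[j]) / n
        w = [w[j] - lr * grad[j] for j in range(2)]
    return w


# Illustrative data: points on the line y = 2x, so one number per point suffices.
DATA = [(-2.0, -4.0), (-1.0, -2.0), (0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]
```

Calling `train(DATA)` and comparing `reconstruction_loss` before and after shows the error shrinking as the weights align with the direction along which the data varies. Practical autoencoders use nonlinear layers and separate encoder and decoder weights.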
