VIP content

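Although variational autoencoders (VAEs) represent a widely influential class of deep generative models, many aspects of the underlying energy function remain poorly understood. In particular, it is commonly believed that Gaussian encoder/decoder assumptions reduce the effectiveness of VAEs in generating realistic samples. In this regard, we rigorously analyze the VAE objective, differentiating situations where this belief is and is not actually true. We then leverage the corresponding insights to develop a simple VAE enhancement that requires no additional hyperparameters or sensitive tuning. Quantitatively, this proposal produces crisp samples and stable FID scores that are actually competitive with a variety of GAN models, all while retaining desirable attributes of the original VAE architecture. A shorter version of this work will appear in the ICLR 2019 conference proceedings (Dai and Wipf, 2019). The code for our model is available at this https URL TwoStageVAE.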
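
For readers unfamiliar with the objective the abstract refers to, the standard VAE objective under Gaussian encoder/decoder assumptions can be written as below. This is the textbook form rather than anything quoted from the paper; the symbols ($\mu_z$, $\sigma_z$, $\mu_x$, $\gamma$, $p_{\text{data}}$) are notation assumed here for illustration.

```latex
% Negative evidence lower bound (ELBO), averaged over the data distribution
\mathcal{L}(\theta,\phi)
  = \mathbb{E}_{x \sim p_{\text{data}}}\Big[
      \mathrm{KL}\big(q_\phi(z \mid x)\,\|\,p(z)\big)
      - \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
    \Big]
  \;\ge\; \mathbb{E}_{x \sim p_{\text{data}}}\big[-\log p_\theta(x)\big]

% Gaussian assumptions (assumed parameterization)
p(z) = \mathcal{N}(z;\, 0,\, I), \qquad
q_\phi(z \mid x) = \mathcal{N}\big(z;\, \mu_z(x),\, \operatorname{diag}\,\sigma_z^2(x)\big), \qquad
p_\theta(x \mid z) = \mathcal{N}\big(x;\, \mu_x(z),\, \gamma I\big)
```

The inequality holds because the bracketed term equals $-\log p_\theta(x)$ plus the non-negative quantity $\mathrm{KL}\big(q_\phi(z \mid x)\,\|\,p_\theta(z \mid x)\big)$.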


Latest content

Supervised speech enhancement relies on parallel databases of degraded speech signals and their clean reference signals during training. This setting prohibits the use of real-world degraded speech data that may better represent the scenarios where such systems are used. In this paper, we explore methods that enable supervised speech enhancement systems to train on real-world degraded speech data. Specifically, we propose a semi-supervised approach for speech enhancement in which we first train a modified vector-quantized variational autoencoder that solves a source separation task. We then use this trained autoencoder to further train an enhancement network using real-world noisy speech data by computing a triplet-based unsupervised loss function. Experiments show promising results for incorporating real-world data in training speech enhancement systems.
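
The abstract does not give the form of its triplet-based loss. As a rough illustration only, the generic margin-based triplet loss that such methods usually build on can be sketched as follows. The function names, the Euclidean distance, and the `margin` value are assumptions made for this example, not details from the paper.

```python
import math
from typing import Sequence


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 1.0,  # hypothetical default, chosen for illustration
) -> float:
    """Generic triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)
```

In this sketch the three inputs would be embedding vectors. How the anchor, positive, and negative are chosen from real-world noisy speech is specific to the paper and is not reproduced here.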
