The generation of discontinuous distributions is a difficult task for most known frameworks, such as generative autoencoders and generative adversarial networks. Generative non-invertible models are unable to generate such distributions accurately, require long training, and are often subject to mode collapse. Variational autoencoders (VAEs), which constrain the latent space to be Gaussian for the sake of simple sampling, allow accurate reconstruction but suffer significant limitations at the generation task. In this work, instead of forcing the latent space to be Gaussian, we use a pre-trained contrastive encoder to obtain a clustered latent space. Then, for each cluster representing a unimodal submanifold, we train a dedicated low-complexity network to generate this submanifold from the Gaussian distribution. The proposed framework is based on an information-theoretic formulation of mutual information maximization between the input data and the latent-space representation. We derive a link between the cost functions and this information-theoretic formulation. We apply our approach to synthetic 2D distributions to demonstrate both reconstruction and generation of discontinuous distributions using continuous stochastic networks.
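As an illustrative sketch only (not the authors' implementation), the per-cluster generation idea described above can be expressed as follows: a small, hypothetical MLP generator is trained for each latent-space cluster, and sampling proceeds by first choosing a cluster according to its empirical weight and then pushing Gaussian noise through that cluster's generator. All class and function names here are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class ClusterGenerator(nn.Module):
    """Low-complexity MLP mapping Gaussian noise to one latent-space cluster
    (a unimodal submanifold). Hypothetical sketch, not the paper's code."""
    def __init__(self, noise_dim: int, latent_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, latent_dim),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z)

def sample_latents(generators, cluster_weights, n_samples, noise_dim):
    """Sample latent codes of a discontinuous distribution: pick a cluster
    according to its empirical weight, then generate that many codes with
    the corresponding per-cluster generator."""
    idx = torch.multinomial(cluster_weights, n_samples, replacement=True)
    samples = []
    for k, gen in enumerate(generators):
        n_k = int((idx == k).sum())
        if n_k > 0:
            samples.append(gen(torch.randn(n_k, noise_dim)))
    return torch.cat(samples, dim=0)
```

In this sketch, the cluster assignments and weights would come from clustering the codes of the pre-trained contrastive encoder, and each `ClusterGenerator` would be trained only on the codes of its own cluster, keeping every individual network low-complexity.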