Mixture models in variational inference (VI) are an active field of research. Recent works have established their connection to multiple importance sampling (MIS) through the MISELBO and advanced the use of ensemble approximations for large-scale problems. However, as we show here, learning the ensemble components independently can lead to suboptimal diversity. Hence, we study the effect of instead using MISELBO as an objective function for learning mixtures, and we propose the first-ever mixture of variational approximations for a normalizing flow-based hierarchical variational autoencoder (VAE) with VampPrior and a PixelCNN decoder network. Two major insights led to the construction of this novel composite model. First, mixture models have the potential to be off-the-shelf tools for practitioners to obtain more flexible posterior approximations in VAEs. Therefore, we make them more accessible by demonstrating how to apply them to four popular architectures. Second, when MISELBO is the objective function, the mixture components cooperate to cover the target distribution while trying to maximize their diversity. We explain this cooperative behavior by drawing a novel connection between VI and adaptive importance sampling. Finally, we demonstrate the superiority of the Mixture VAEs' learned feature representations on both image and single-cell transcriptome data, and obtain state-of-the-art results among VAE architectures in terms of negative log-likelihood on the MNIST and FashionMNIST datasets. Code available here: \url{https://github.com/Lagergren-Lab/MixtureVAEs}.
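For readers unfamiliar with the objective, the following is a minimal sketch of the MISELBO criterion for a mixture of $S$ equally weighted variational components $q_{\phi_1},\dots,q_{\phi_S}$, written in its standard multiple-importance-sampling form from the MISELBO literature; the notation ($\theta$ for the generative model, $z$ for the latent variables, $x$ for the data) is ours and not taken verbatim from this paper.

% Hedged sketch of the MISELBO objective (standard MIS form, notation assumed):
% each component q_{\phi_s} is evaluated against the full mixture in the
% denominator, which couples the components during optimization.
\begin{equation*}
\mathcal{L}_{\mathrm{MIS}}(\theta, \phi_{1:S}; x)
  = \frac{1}{S} \sum_{s=1}^{S}
    \mathbb{E}_{q_{\phi_s}(z \mid x)}
    \left[
      \log \frac{p_{\theta}(x, z)}{\frac{1}{S} \sum_{j=1}^{S} q_{\phi_j}(z \mid x)}
    \right]
\end{equation*}

Because the mixture appears in the denominator, each component is discouraged from placing mass where the others already do, which is consistent with the cooperative, diversity-seeking behavior described in the abstract.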