We prove identifiability of a broad class of deep latent variable models that (a) have universal approximation capabilities and (b) are the decoders of variational autoencoders that are commonly used in practice. Unlike existing work, our analysis does not require weak supervision, auxiliary information, or conditioning in the latent space. Specifically, we show that for a broad class of generative (i.e., unsupervised) models with universal approximation capabilities, the side information $u$ assumed in prior work is not necessary: we prove identifiability of the entire generative model in the setting where $u$ is unobserved and only the data $x$ is observed. The models we consider match autoencoder architectures used in practice that leverage mixture priors in the latent space and ReLU/leaky-ReLU activations in the encoder, such as VaDE and MFC-VAE. Our main result is an identifiability hierarchy that significantly generalizes previous work, exposes how different assumptions lead to different "strengths" of identifiability, and includes certain "vanilla" VAEs with isotropic Gaussian priors as a special case. For example, our weakest result establishes (unsupervised) identifiability up to an affine transformation, and thus partially resolves an open problem regarding model identifiability raised in prior work. These theoretical results are augmented with experiments on both simulated and real data.
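To make the model class concrete, the following is a minimal sketch (assuming PyTorch) of the kind of generative model described above: a Gaussian-mixture prior over the latent space feeding a leaky-ReLU network that decodes latents into observations. All class names, dimensions, and hyperparameters here are illustrative, not the exact VaDE or MFC-VAE implementations.

```python
# Illustrative sketch of the generative-model class: Gaussian-mixture prior p(z)
# plus a piecewise-affine (leaky-ReLU) decoder f, so x = f(z).
# Names and hyperparameters are hypothetical, not taken from the paper.
import torch
import torch.nn as nn

class MixturePriorDecoder(nn.Module):
    def __init__(self, n_components=10, latent_dim=2, data_dim=784, hidden=256):
        super().__init__()
        # Gaussian-mixture prior parameters: component weights, means, scales.
        self.logits = nn.Parameter(torch.zeros(n_components))
        self.means = nn.Parameter(torch.randn(n_components, latent_dim))
        self.log_scales = nn.Parameter(torch.zeros(n_components, latent_dim))
        # Decoder built from linear layers and leaky-ReLU activations,
        # i.e. a piecewise-affine map from latents to data space.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.LeakyReLU(0.2),
            nn.Linear(hidden, hidden), nn.LeakyReLU(0.2),
            nn.Linear(hidden, data_dim),
        )

    def sample(self, n):
        # Draw mixture components, then z given the component, then decode.
        comp = torch.distributions.Categorical(logits=self.logits).sample((n,))
        eps = torch.randn(n, self.means.shape[1])
        z = self.means[comp] + self.log_scales[comp].exp() * eps
        return self.decoder(z)

x = MixturePriorDecoder().sample(16)  # 16 draws from the generative model
print(x.shape)  # torch.Size([16, 784])
```

Identifiability here means that, under the paper's assumptions, the prior parameters and the decoder are recoverable (up to the stated equivalence, e.g. an affine transformation of the latent space) from the distribution of $x$ alone.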