Latent variables (LVs) play a crucial role in encoder-decoder models by enabling effective data compression, prediction, and generation. Although their theoretical properties, such as generalization, have been extensively studied in supervised learning, similar analyses for unsupervised models such as variational autoencoders (VAEs) remain underexplored. In this work, we extend information-theoretic generalization analysis to vector-quantized (VQ) VAEs with discrete latent spaces, introducing a novel data-dependent prior to rigorously analyze the relationship among LVs, generalization, and data generation. We derive a novel generalization error bound for the reconstruction loss of VQ-VAEs, which depends solely on the complexity of the LVs and the encoder and is independent of the decoder. Additionally, we provide an upper bound on the 2-Wasserstein distance between the distributions of the true data and the generated data, explaining how regularization of the LVs contributes to data generation performance.
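To make the quantities named in the abstract concrete, the following is a minimal sketch of a vector-quantized autoencoder and its empirical generalization gap (test reconstruction loss minus training reconstruction loss). It is an illustration only, not the construction analyzed in the paper: the fixed random linear encoder, the k-means codebook, the least-squares linear decoder, the Gaussian toy data, and all names (`quantize`, `fit_codebook`, `reconstruction_loss`) are assumptions introduced here.

```python
import numpy as np

def quantize(z, codebook):
    """Return, for each row of z (n, d), the index of the nearest codebook vector (K, d)."""
    # Squared Euclidean distance between every latent and every codebook entry.
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

def fit_codebook(z, k, iters, rng):
    """Fit a codebook with a few Lloyd (k-means) iterations; a stand-in for VQ training."""
    codebook = z[rng.choice(len(z), size=k, replace=False)].copy()
    for _ in range(iters):
        idx = quantize(z, codebook)
        for j in range(k):
            members = z[idx == j]
            if len(members) > 0:  # leave empty clusters unchanged
                codebook[j] = members.mean(axis=0)
    return codebook

def reconstruction_loss(x, encoder_w, codebook, decoder_w):
    """Mean squared reconstruction error of the toy linear VQ autoencoder."""
    idx = quantize(x @ encoder_w, codebook)  # discrete latent variables
    x_hat = codebook[idx] @ decoder_w        # decode from the selected codebook vectors
    return float(((x - x_hat) ** 2).sum(axis=1).mean())

rng = np.random.default_rng(0)
x_train = rng.normal(size=(200, 4))  # toy training sample (assumed Gaussian)
x_test = rng.normal(size=(200, 4))   # fresh sample from the same distribution

encoder_w = rng.normal(size=(4, 2))  # fixed random linear encoder (assumption)
codebook = fit_codebook(x_train @ encoder_w, k=8, iters=10, rng=rng)

# Fit a linear decoder by least squares on the quantized training latents.
train_idx = quantize(x_train @ encoder_w, codebook)
decoder_w, *_ = np.linalg.lstsq(codebook[train_idx], x_train, rcond=None)

train_loss = reconstruction_loss(x_train, encoder_w, codebook, decoder_w)
test_loss = reconstruction_loss(x_test, encoder_w, codebook, decoder_w)
gap = test_loss - train_loss  # empirical generalization gap of the reconstruction loss
print(f"train={train_loss:.3f} test={test_loss:.3f} gap={gap:.3f}")
```

In this toy setup the codebook and decoder are both fitted to the training sample, so the gap reflects how much the learned discrete representation overfits that sample. The abstract's claim is that, for VQ-VAEs, such a gap can be bounded in terms of the LVs and the encoder alone.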