Uncovering the generative factors of data is the ultimate goal of disentanglement learning. Although many works have proposed disentangling generative models able to uncover the underlying generative factors of a dataset, so far none has been able to uncover OOD generative factors (i.e., factors of variation that are not explicitly present in the dataset). Moreover, the datasets used to validate these models are synthetically generated from a balanced mixture of predefined generative factors, implicitly assuming that generative factors are uniformly distributed across the dataset. Real datasets, however, do not exhibit this property. In this work we analyse the effect of training on datasets with unbalanced generative factors, providing qualitative and quantitative results for widely used generative models. Moreover, we propose TC-VAE, a generative model optimized using a lower bound on the joint total correlation between the learned latent representations and the input data. We show that the proposed model uncovers OOD generative factors on several datasets and, on average, outperforms related baselines on downstream disentanglement metrics.
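The abstract does not reproduce the paper's actual lower bound on the joint total correlation, but the quantity being penalized can be illustrated. The sketch below, a hedged NumPy illustration (the function name and Gaussian assumption are ours, not the paper's), computes the total correlation TC(z) = Σ_i H(z_i) − H(z) in closed form for a multivariate Gaussian, where it reduces to 0.5·(Σ_i log Σ_ii − log det Σ) and vanishes exactly when the latent dimensions are independent:

```python
import numpy as np

def gaussian_total_correlation(cov):
    """Total correlation of a zero-mean multivariate Gaussian.

    TC(z) = sum_i H(z_i) - H(z)
          = 0.5 * (sum_i log cov_ii - log det(cov))
    Zero iff the covariance is diagonal, i.e. dimensions are independent.
    (Illustrative helper; not the paper's joint-TC bound.)
    """
    cov = np.asarray(cov, dtype=float)
    sign, logdet = np.linalg.slogdet(cov)
    assert sign > 0, "covariance must be positive definite"
    return 0.5 * (np.sum(np.log(np.diag(cov))) - logdet)

# Independent latent dimensions -> TC is zero
print(gaussian_total_correlation(np.eye(3)))  # -> 0.0

# Correlated (entangled) dimensions -> TC is positive
cov = np.array([[1.0, 0.8],
                [0.8, 1.0]])
print(round(gaussian_total_correlation(cov), 4))  # -> 0.5108
```

Minimizing such a term encourages a factorized latent posterior, which is the intuition behind TC-based disentanglement objectives.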