The theory of identifiable representation learning aims to build general-purpose methods that extract high-level latent (causal) factors from low-level sensory data. Most existing work focuses on identifiable representation learning with observational data, relying on distributional assumptions on the latent (causal) factors. In practice, however, we often also have access to interventional data. How can we leverage such data to help identify high-level latents? To this end, we study the identifiability of latent causal factors with and without interventional data, under minimal distributional assumptions on the latents. We prove that, if the true latent variables map to the observed high-dimensional data via a polynomial function, then representation learning via minimizing the standard reconstruction loss of autoencoders identifies the true latents up to affine transformation. If we further have access to interventional data generated by hard $do$ interventions on some of the latents, then we can identify these intervened latents up to permutation, shift, and scaling.
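The affine indeterminacy in the observational result can be seen directly: composing a polynomial decoder with an invertible affine reparameterization of the latents yields another polynomial decoder that achieves the same reconstruction loss, so reconstruction alone cannot distinguish the two. A minimal numpy sketch of this fact, with illustrative dimensions and random coefficients not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def poly_features(z):
    # Degree-2 monomials of a 2-D latent: [1, z1, z2, z1^2, z1*z2, z2^2]
    z1, z2 = z[:, 0], z[:, 1]
    return np.stack([np.ones_like(z1), z1, z2, z1**2, z1 * z2, z2**2], axis=1)

# Ground-truth polynomial decoder: x = poly_features(z) @ B.T, with x in R^10
n, d = 500, 10
z = rng.normal(size=(n, 2))
B = rng.normal(size=(d, 6))
x = poly_features(z) @ B.T

# An arbitrary invertible affine reparameterization of the latents
L = np.array([[2.0, 0.5], [-1.0, 1.5]])
c = np.array([0.3, -0.7])
z_hat = z @ L.T + c

# Since z is an affine function of z_hat, x is still an exact degree-2
# polynomial in z_hat, so a least-squares "decoder" fit on the
# reparameterized latents reconstructs x perfectly.
B_hat, *_ = np.linalg.lstsq(poly_features(z_hat), x, rcond=None)
x_rec = poly_features(z_hat) @ B_hat
print(np.max(np.abs(x - x_rec)))  # essentially zero: z and L z + c are indistinguishable
```

This is why the observational guarantee stops at affine equivalence; the hard $do$ interventions pin down the intervened coordinates further, up to permutation, shift, and scaling.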