We propose a notion of common information that allows one to quantify and separate the information that is shared between two random variables from the information that is unique to each. Our notion of common information is a variational relaxation of the G\'acs-K\"orner common information, which we recover as a special case, but is more amenable to optimization and can be approximated empirically using samples from the underlying distribution. We then provide a method to partition and quantify the common and unique information using a simple modification of a traditional variational auto-encoder. Empirically, we demonstrate that our formulation allows us to learn semantically meaningful common and unique factors of variation even on high-dimensional data such as images and videos. Moreover, on datasets where ground-truth latent factors are known, we show that we can accurately quantify the common information between the random variables. Additionally, we show that the auto-encoder that we learn recovers semantically meaningful disentangled factors of variation, even though we do not explicitly optimize for disentanglement.
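As a rough illustration of the kind of modification the abstract describes, the sketch below shows a two-branch variational auto-encoder whose latent code is split into a "common" block, encouraged to agree across the two views, and a "unique" block per view. This is a minimal sketch under our own assumptions, not the authors' reference implementation: the class and parameter names (`CommonVAE`, `d_common`, `d_unique`), the network sizes, and the specific squared-error consistency penalty are all illustrative choices.

```python
# Minimal sketch (illustrative, not the paper's implementation) of a
# two-branch VAE with a latent split into common and unique blocks.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, d_in, d_common, d_unique):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d_in, 256), nn.ReLU())
        d_lat = d_common + d_unique
        self.mu = nn.Linear(256, d_lat)       # posterior mean
        self.logvar = nn.Linear(256, d_lat)   # posterior log-variance
    def forward(self, x):
        h = self.net(x)
        return self.mu(h), self.logvar(h)

class CommonVAE(nn.Module):
    def __init__(self, d_in, d_common=8, d_unique=8):
        super().__init__()
        self.d_common = d_common
        self.enc1 = Encoder(d_in, d_common, d_unique)
        self.enc2 = Encoder(d_in, d_common, d_unique)
        self.dec1 = nn.Sequential(nn.Linear(d_common + d_unique, 256),
                                  nn.ReLU(), nn.Linear(256, d_in))
        self.dec2 = nn.Sequential(nn.Linear(d_common + d_unique, 256),
                                  nn.ReLU(), nn.Linear(256, d_in))

    @staticmethod
    def reparam(mu, logvar):
        # Reparameterization trick: z = mu + sigma * eps.
        return mu + torch.randn_like(mu) * (0.5 * logvar).exp()

    def forward(self, x1, x2):
        mu1, lv1 = self.enc1(x1)
        mu2, lv2 = self.enc2(x2)
        z1, z2 = self.reparam(mu1, lv1), self.reparam(mu2, lv2)
        # Penalize disagreement between the two common blocks; in the
        # exact Gacs-Korner case the common part is a deterministic
        # function of either variable, so the blocks should coincide.
        dc = self.d_common
        consistency = ((mu1[:, :dc] - mu2[:, :dc]) ** 2).mean()
        recon = ((self.dec1(z1) - x1) ** 2).mean() + \
                ((self.dec2(z2) - x2) ** 2).mean()
        kl = -0.5 * (1 + lv1 - mu1 ** 2 - lv1.exp()).mean() \
             -0.5 * (1 + lv2 - mu2 ** 2 - lv2.exp()).mean()
        return recon + kl + consistency
```

Under these assumptions, training is ordinary VAE training on paired samples, e.g. `loss = CommonVAE(d_in=784)(x1, x2); loss.backward()`; the common block then gives the learned shared factors and the remaining coordinates the view-specific ones.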