A large part of the literature on learning disentangled representations focuses on variational autoencoders (VAEs). Recent results demonstrate that disentanglement cannot be obtained in a fully unsupervised setting without inductive biases on models and data. However, Khemakhem et al. (AISTATS 2020) suggest that employing a particular form of factorized prior, conditionally dependent on auxiliary variables that complement the input observations, can be one such bias, resulting in an identifiable model with guarantees on disentanglement. Following this line of work, we propose a novel VAE-based generative model with theoretical guarantees on identifiability. We obtain our conditional prior over the latents by learning an optimal representation, which acts as an additional source of regularization on the latents. We also extend our method to semi-supervised settings. Experimental results indicate superior performance with respect to state-of-the-art approaches, as measured by several established disentanglement metrics proposed in the literature.
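To make the inductive bias concrete, below is a minimal sketch (not the authors' implementation) of a VAE whose prior over the latents z is a factorized Gaussian conditioned on an auxiliary variable u, in the spirit of the conditionally factorized prior of Khemakhem et al. (AISTATS 2020). All names (ConditionalPriorVAE, dimensions, network sizes) are illustrative assumptions.

```python
# Sketch of a VAE with a conditional prior p(z | u), assuming PyTorch.
# The prior's parameters depend on the auxiliary variable u -- the
# inductive bias that the abstract describes as enabling identifiability.
import torch
import torch.nn as nn

class ConditionalPriorVAE(nn.Module):
    def __init__(self, x_dim, u_dim, z_dim, hidden=128):
        super().__init__()
        # Encoder q(z | x, u): amortized diagonal-Gaussian posterior.
        self.encoder = nn.Sequential(nn.Linear(x_dim + u_dim, hidden), nn.ReLU())
        self.enc_mu = nn.Linear(hidden, z_dim)
        self.enc_logvar = nn.Linear(hidden, z_dim)
        # Conditional prior p(z | u): factorized Gaussian parameterized by u.
        self.prior = nn.Sequential(nn.Linear(u_dim, hidden), nn.ReLU())
        self.prior_mu = nn.Linear(hidden, z_dim)
        self.prior_logvar = nn.Linear(hidden, z_dim)
        # Decoder p(x | z).
        self.decoder = nn.Sequential(nn.Linear(z_dim, hidden), nn.ReLU(),
                                     nn.Linear(hidden, x_dim))

    def forward(self, x, u):
        h = self.encoder(torch.cat([x, u], dim=-1))
        mu_q, logvar_q = self.enc_mu(h), self.enc_logvar(h)
        # Reparameterization trick.
        z = mu_q + torch.randn_like(mu_q) * torch.exp(0.5 * logvar_q)
        hp = self.prior(u)
        mu_p, logvar_p = self.prior_mu(hp), self.prior_logvar(hp)
        # Closed-form KL between the two diagonal Gaussians q(z|x,u) || p(z|u);
        # conditioning the prior on u turns this term into the extra
        # regularization on the latents mentioned in the abstract.
        kl = 0.5 * (logvar_p - logvar_q
                    + (logvar_q.exp() + (mu_q - mu_p) ** 2) / logvar_p.exp()
                    - 1.0).sum(-1)
        recon = self.decoder(z)
        return recon, kl
```

Training would minimize a reconstruction loss on `recon` plus the `kl` term, i.e. the usual negative ELBO, with the standard prior N(0, I) replaced by the learned conditional prior.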