Disentangling Identifiable Features from Noisy Data with Structured Nonlinear ICA

We introduce a new, general, identifiable framework for principled disentanglement, referred to as Structured Nonlinear Independent Component Analysis (SNICA). Our contribution is to extend the identifiability theory of deep generative models to a very broad class of structured models. While previous works have shown identifiability for specific classes of time-series models, our theorems extend this to more general temporal structures as well as to models with more complex structures such as spatial dependencies. In particular, we establish the major result that identifiability in this framework holds even in the presence of noise of unknown distribution. The SNICA setting therefore subsumes all existing nonlinear ICA models for time series and also allows for new, much richer identifiable models. Finally, as an example of our framework's flexibility, we introduce the first nonlinear ICA model for time series that combines the following very useful properties: it accounts for both nonstationarity and autocorrelation in a fully unsupervised setting; performs dimensionality reduction; models hidden states; and enables principled estimation and inference by variational maximum likelihood.
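
To make the ingredients listed above concrete, the following is a minimal toy sketch of the kind of data such a model might describe. It is not the authors' model or code. The function name `simulate_snica_like`, the choice of a hidden Markov chain for the hidden states, the AR(1) source dynamics, the two-layer leaky-ReLU mixing, and the Laplace noise are all assumptions made for illustration. Observations are generated as x_t = f(s_t) + eps_t, with more observed dimensions than sources, so that recovering the sources involves dimensionality reduction.

```python
import numpy as np


def simulate_snica_like(T=500, n_sources=3, n_obs=8, n_states=2,
                        noise_std=0.1, seed=0):
    """Sample a toy noisy nonlinear ICA time series (illustrative assumptions only).

    Requires n_states >= 2. Returns (x, s, u): observations, sources, hidden states.
    """
    rng = np.random.default_rng(seed)

    # Hidden Markov chain u_t: switching between states changes the source
    # dynamics over time, which gives nonstationarity (assumed mechanism).
    stay = 0.98
    P = np.full((n_states, n_states), (1.0 - stay) / (n_states - 1))
    np.fill_diagonal(P, stay)
    u = np.zeros(T, dtype=int)
    for t in range(1, T):
        u[t] = rng.choice(n_states, p=P[u[t - 1]])

    # State-dependent AR(1) coefficient and innovation scale per source.
    ar = rng.uniform(0.3, 0.95, size=(n_states, n_sources))
    scale = rng.uniform(0.2, 1.0, size=(n_states, n_sources))

    # Independent sources: each component evolves on its own (autocorrelation).
    s = np.zeros((T, n_sources))
    for t in range(1, T):
        s[t] = ar[u[t]] * s[t - 1] + scale[u[t]] * rng.standard_normal(n_sources)

    # Nonlinear mixing f: a small random network mapping n_sources -> n_obs.
    W1 = rng.standard_normal((n_sources, n_obs))
    W2 = rng.standard_normal((n_obs, n_obs))
    h = s @ W1
    h = np.where(h > 0, h, 0.2 * h)  # leaky ReLU
    fx = h @ W2

    # Additive observation noise. Laplace is an arbitrary choice here; the
    # abstract's claim is that the noise distribution may be unknown.
    eps = rng.laplace(scale=noise_std / np.sqrt(2), size=fx.shape)
    return fx + eps, s, u


x, s, u = simulate_snica_like()
print(x.shape, s.shape, u.shape)
```

In this sketch an estimation method would see only `x`; the arrays `s` and `u` are returned so that recovered components could be compared against the ground truth in a simulation study.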
