Real-world databases are complex and usually contain heterogeneous, mixed data types, which makes exploiting the information shared between views a critical issue. To this end, recent studies based on deep generative models merge all views into a single complex nonlinear latent space that shares information among them. However, this solution limits the model's interpretability, flexibility, and modularity. We propose a novel method that overcomes these limitations by combining multiple Variational AutoEncoders (VAEs) with a Factor Analysis latent space (FA-VAE). We use a VAE to learn a private representation of each heterogeneous view in a continuous latent space. We then share information between views through a low-dimensional latent space using a linear projection matrix. In this way, we create a flexible and modular hierarchical dependency between private and shared information, into which new views can be incorporated afterwards. Beyond that, we can condition pre-trained models, cross-generate data from different domains, and perform transfer learning between generative models.
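The hierarchy described above, with one private latent space per view tied to a shared low-dimensional space by a linear projection, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The VAE encoders and decoders are omitted, so the code works directly on private latent vectors. The view names, dimensions, isotropic noise level, and the functions `sample_private_latents`, `infer_shared`, and `cross_generate` are all assumptions made for illustration. The inference step uses the standard posterior mean of linear-Gaussian factor analysis, which may differ from the inference scheme used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

K_SHARED = 2                             # shared latent dimension (assumed)
PRIVATE_DIMS = {"image": 5, "label": 3}  # hypothetical views and private latent sizes
NOISE_STD = 0.1                          # assumed isotropic noise on private latents

# One linear projection matrix per view, mapping shared -> private latent space.
W = {name: rng.normal(size=(d, K_SHARED)) for name, d in PRIVATE_DIMS.items()}


def sample_private_latents(n):
    """Draw shared latents z and project them into each view's private space."""
    z = rng.normal(size=(n, K_SHARED))
    private = {
        name: z @ W[name].T + NOISE_STD * rng.normal(size=(n, d))
        for name, d in PRIVATE_DIMS.items()
    }
    return z, private


def infer_shared(view, x_private):
    """Posterior mean of z given one view's private latents (factor analysis)."""
    Wv = W[view]
    precision = np.eye(K_SHARED) + Wv.T @ Wv / NOISE_STD**2
    return np.linalg.solve(precision, Wv.T @ x_private.T / NOISE_STD**2).T


def cross_generate(src, dst, x_private):
    """Predict the dst view's private latents from the src view's private latents."""
    return infer_shared(src, x_private) @ W[dst].T


z, private = sample_private_latents(4)
label_from_image = cross_generate("image", "label", private["image"])
print(label_from_image.shape)
```

In this sketch, incorporating a new view only requires adding one more projection matrix to `W`; the existing entries are left untouched, which mirrors the modularity argument in the abstract. In a full model, each private latent vector would come from that view's VAE encoder and be passed to its decoder.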