Multiview data contain information from multiple modalities and have the potential to provide more comprehensive features for diverse machine learning tasks. A fundamental question in multiview analysis is what additional information each extra view brings, and whether that information can be identified quantitatively. In this work, we tackle this challenge by decomposing the entangled multiview features into shared latent representations that are common across all views and private representations that are specific to each single view. We formulate this feature disentanglement in the framework of the information bottleneck and propose the disentangled variational information bottleneck (DVIB). DVIB explicitly defines the properties of the shared and private representations using constraints on mutual information. By deriving variational upper and lower bounds on the mutual information terms, the representations can be optimized efficiently. We demonstrate that the shared representations learned by DVIB preserve the common labels shared between two views, and that the private representations preserve the unique labels corresponding to each single view. DVIB also achieves comparable performance on classification of corrupted images. The DVIB implementation is available at https://github.com/feng-bao-ucsf/DVIB.
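To make the phrase "variational upper bound" concrete, here is a minimal sketch of the bound commonly used in variational information bottleneck methods. It is an illustration under stated assumptions, not the DVIB implementation: it assumes a diagonal Gaussian encoder q(z|x) and a standard normal reference distribution r(z), and the function names are hypothetical. Under those assumptions, the expected KL divergence from q(z|x) to r(z) upper-bounds I(X; Z), and the batch average below estimates that expectation.

```python
import math


def gaussian_kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for a single sample.

    mu and log_var are equal-length sequences of floats
    (assumed outputs of a Gaussian encoder; hypothetical names).
    """
    return 0.5 * sum(
        m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var)
    )


def mi_upper_bound(batch_mu, batch_log_var):
    """Batch average of the per-sample KL terms.

    This is a Monte Carlo estimate of E_x[ KL(q(z|x) || r(z)) ],
    which upper-bounds I(X; Z) for any choice of r(z).
    """
    kls = [
        gaussian_kl_to_standard_normal(mu, lv)
        for mu, lv in zip(batch_mu, batch_log_var)
    ]
    return sum(kls) / len(kls)


if __name__ == "__main__":
    # Two samples with a 1-dimensional latent code (made-up values).
    print(mi_upper_bound([[0.0], [1.0]], [[0.0], [0.0]]))
```

Minimizing a term of this kind penalizes how much a representation retains about its input, while a corresponding lower bound is typically used for the terms that should be maximized. How DVIB combines such terms for the shared and private representations is defined in the paper and the linked repository.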