Multimodal datasets contain observations generated by multiple types of sensors. Most work to date focuses on uncovering latent structures in the data that appear in all modalities. However, important aspects of the data may appear in only one modality because of differences between the sensors. Uncovering modality-specific attributes may provide insight into the sources of variability in the data. For example, certain clusters may appear in the analysis of genetic data but not in epigenetic markers. Another example is hyperspectral satellite imaging, where various atmospheric and ground phenomena are detectable in different parts of the spectrum. In this paper, we address the problem of uncovering latent structures that are unique to a single modality. Our approach is based on computing graph representations of datasets from two modalities and analyzing the differences between their connectivity patterns. We provide an asymptotic analysis of the convergence of our approach based on a product manifold model. To evaluate the performance of our method, we test its ability to uncover latent structures in multiple types of artificial and real datasets.