Multi-view Data Visualisation via Manifold Learning

Manifold learning approaches, such as Stochastic Neighbour Embedding (SNE), Locally Linear Embedding (LLE) and Isometric Feature Mapping (ISOMAP), have been proposed for performing non-linear dimensionality reduction. These methods aim to produce embeddings with two or three latent dimensions, so that the data can be visualised in an intelligible representation. This manuscript proposes extensions of Student's t-distributed SNE (t-SNE), LLE and ISOMAP to allow for dimensionality reduction and subsequent visualisation of multi-view data. It is now very common to have multiple data-views on the same samples, where each data-view contains a set of features describing a different aspect of the samples. For example, in biomedical studies it is possible to generate multiple OMICS data sets for the same individuals, such as transcriptomics, genomics and epigenomics, enabling a better understanding of the relationships between the different biological processes. The visualisation performance of the proposed methods is illustrated through the analysis of real and simulated data sets. Data visualisations have often been used to identify potential clusters in the data. We show that by feeding the low-dimensional embeddings obtained via the multi-view manifold learning approaches into the K-means algorithm, clusters of the samples are accurately identified. Our proposed multi-SNE method outperforms the corresponding proposed multi-ISOMAP and multi-LLE methods. Interestingly, multi-SNE is found to have performance comparable with methods proposed in the literature for multi-view clustering.
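The embed-then-cluster pipeline described in the abstract can be sketched as follows. This is not the authors' multi-SNE: it is a naive baseline that standardises each view, concatenates the features and runs ordinary single-view t-SNE from scikit-learn before K-means. The two-view toy data, the variable names and all parameter values are assumptions made for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical two-view toy data: both views describe the same 150 samples.
view_a, labels = make_blobs(
    n_samples=150, centers=3, n_features=10, cluster_std=1.0, random_state=0
)
# Second view: a noisy linear transformation of the first (an assumption).
view_b = view_a @ rng.normal(size=(10, 6)) + rng.normal(scale=0.5, size=(150, 6))

# Naive baseline (NOT multi-SNE): standardise each view, then concatenate.
stacked = np.hstack([StandardScaler().fit_transform(v) for v in (view_a, view_b)])

# Two-dimensional embedding for visualisation.
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(stacked)

# Cluster the samples in the embedded space and compare with the true labels.
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(embedding)
ari = adjusted_rand_score(labels, clusters)
print(f"embedding shape: {embedding.shape}, adjusted Rand index: {ari:.3f}")
```

Concatenation treats every view as equally informative. A method that combines views more carefully, as the abstract describes for multi-SNE, would replace the `stacked` and `TSNE` steps while the K-means step stays the same.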


Related content

Manifold learning

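Manifold learning, in full manifold learning methods, has been a research hotspot in information science since it was first proposed in the journal Science in 2000. It is significant both in theory and in applications. Assuming that the data are sampled uniformly from a low-dimensional manifold embedded in a high-dimensional Euclidean space, manifold learning recovers the low-dimensional manifold structure from the high-dimensional samples: it finds the low-dimensional manifold in the high-dimensional space and computes the corresponding embedding map, in order to achieve dimensionality reduction or data visualisation. The aim is to look past the observed phenomena to what underlies them, and to find the intrinsic rules that generate the data.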
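As a minimal sketch of recovering a low-dimensional manifold from high-dimensional samples, the snippet below applies scikit-learn's `Isomap` and `LocallyLinearEmbedding` to a synthetic "swiss roll", a two-dimensional sheet rolled up in three dimensions. The data set, sample size and neighbourhood size are assumptions chosen for illustration.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap, LocallyLinearEmbedding

# 3-D points sampled from a rolled-up 2-D sheet; `position` is each point's
# coordinate along the roll, which is handy for colouring a scatter plot.
points, position = make_swiss_roll(n_samples=800, noise=0.05, random_state=0)

# ISOMAP: preserve geodesic distances measured along a nearest-neighbour graph.
isomap_embedding = Isomap(n_neighbors=12, n_components=2).fit_transform(points)

# LLE: preserve each point's reconstruction from its nearest neighbours.
lle_embedding = LocallyLinearEmbedding(
    n_neighbors=12, n_components=2, random_state=0
).fit_transform(points)

print(points.shape, isomap_embedding.shape, lle_embedding.shape)
```

Plotting either embedding coloured by `position` gives a quick visual check of whether the roll has been unrolled into a flat sheet.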
