Non-linear dimensionality reduction can be performed by \textit{manifold learning} approaches, such as Stochastic Neighbour Embedding (SNE), Locally Linear Embedding (LLE) and Isometric Feature Mapping (ISOMAP). These methods aim to produce embeddings in two or three latent dimensions, primarily to visualise the data in intelligible representations. This manuscript proposes extensions of Student's t-distributed SNE (t-SNE), LLE and ISOMAP for dimensionality reduction and visualisation of multi-view data. Multi-view data refers to multiple types of data generated from the same samples. The proposed multi-view approaches provide more comprehensible projections of the samples than those obtained by visualising each data-view separately. Visualisation is commonly used for identifying underlying patterns within the samples. By incorporating the low-dimensional embeddings obtained from the multi-view manifold approaches into the K-means clustering algorithm, it is shown that clusters of the samples are accurately identified. Through the analysis of real and synthetic data, the proposed multi-SNE approach is found to have the best performance. We further illustrate the applicability of the multi-SNE approach to the analysis of multi-omics single-cell data, where the aim is to visualise and identify cell heterogeneity and cell types in biological tissues relevant to health and disease.
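The clustering step mentioned above, in which a low-dimensional embedding is passed to K-means, can be illustrated with a minimal sketch. It assumes the embedding has already been computed by some manifold method; the coordinates below are hypothetical placeholders rather than results from the manuscript, and the helper `kmeans` with its farthest-first initialisation is a choice made for this illustration only, not the authors' implementation.

```python
import math

def kmeans(points, k, n_iter=100):
    """Plain Lloyd's algorithm on a list of equal-length coordinate tuples."""
    # Deterministic farthest-first initialisation (an assumption of this sketch).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

# Hypothetical 2-D embedding coordinates (illustrative only, not real data).
embedding = [(0.0, 0.1), (0.2, -0.1), (-0.1, 0.0),
             (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centroids = kmeans(embedding, k=2)
```

In practice the two or three embedding dimensions produced by a multi-view method would take the place of `embedding`, and the number of clusters `k` would be chosen from prior knowledge of the samples.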