t-distributed stochastic neighbor embedding (t-SNE) is a well-established visualization method for complex high-dimensional data. However, the original t-SNE method is nonparametric and stochastic, and it often fails to preserve the global structure of the data because it emphasizes local neighborhoods. With t-SNE as a reference, we propose to combine deep neural networks (DNNs) with mathematically grounded embedding rules for high-dimensional data embedding. We first introduce a deep embedding network (DEN) framework, which learns a parametric mapping from the high-dimensional space to a low-dimensional embedding. DEN has a flexible architecture that can accommodate different input data (vector, image, or tensor) and loss functions. To improve the embedding performance, we propose a recursive training strategy that makes use of the latent representations extracted by DEN. Finally, we propose a two-stage loss function that combines the advantages of two popular embedding methods, namely t-SNE and uniform manifold approximation and projection (UMAP), for optimal visualization. We name the proposed method Deep Recursive Embedding (DRE); it optimizes DEN with the recursive training strategy and the two-stage loss. Our experiments demonstrated the excellent performance of the proposed DRE method on high-dimensional data embedding across a variety of public databases. Remarkably, our comparative results suggested that DRE could lead to improved global structure preservation.
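For orientation, the t-SNE side of such a loss can be sketched as follows. This is a minimal illustration of the standard t-SNE low-dimensional similarity (a Student-t kernel with one degree of freedom) and the KL divergence that t-SNE minimizes. It is not the two-stage DRE loss itself, whose details are not given here. The function names `student_t_affinities` and `kl_divergence` and the small `eps` stabilizer are assumptions made for this sketch.

```python
import math


def student_t_affinities(points):
    """Normalized Student-t similarities q_ij between embedded points.

    Standard t-SNE definition: q_ij is proportional to 1 / (1 + ||y_i - y_j||^2),
    normalized over all pairs with i != j.
    """
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # self-similarity is defined as zero
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            weights[i][j] = 1.0 / (1.0 + sq_dist)
            total += weights[i][j]
    return [[w / total for w in row] for row in weights]


def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) over off-diagonal pairs; eps (an assumption) avoids log(0)."""
    n = len(p)
    return sum(
        p[i][j] * math.log((p[i][j] + eps) / (q[i][j] + eps))
        for i in range(n)
        for j in range(n)
        if i != j and p[i][j] > 0.0
    )


# Hypothetical 2-D embedding of three points: two close together, one far away.
embedding = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
q = student_t_affinities(embedding)
```

In a parametric setting such as DEN, the embedded points would be the network outputs, and a loss of this kind would be minimized with respect to the network weights rather than the point coordinates.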