This paper investigates the theoretical foundations of the t-distributed stochastic neighbor embedding (t-SNE) algorithm, a popular nonlinear dimension reduction and data visualization method. A novel theoretical framework for the analysis of t-SNE based on the gradient descent approach is presented. For the early exaggeration stage of t-SNE, we show its asymptotic equivalence to power iterations based on the underlying graph Laplacian, characterize its limiting behavior, and uncover its deep connection to Laplacian spectral clustering, and fundamental principles including early stopping as implicit regularization. The results explain the intrinsic mechanism and the empirical benefits of such a computational strategy. For the embedding stage of t-SNE, we characterize the kinematics of the low-dimensional map throughout the iterations, and identify an amplification phase, featuring the intercluster repulsion and the expansive behavior of the low-dimensional map, and a stabilization phase. The general theory explains the fast convergence rate and the exceptional empirical performance of t-SNE for visualizing clustered data, brings forth interpretations of the t-SNE visualizations, and provides theoretical guidance for applying t-SNE and selecting its tuning parameters in various applications.
翻译:本文调查了T-SNE(t-SNE)分布式随机相邻嵌入算法(t-SNE)的理论基础,这是一种流行的非线性尺寸递减和数据可视化方法。介绍了基于梯度下降法分析t-SNE的新型理论框架。对于t-SNE的早期夸大阶段,我们展示了它与基于Laplacian 基本图的动力迭代的无线等同性,其限制行为的特点,并揭示了它与Laplaceian光谱集成的深层联系,以及基本原则,包括早期停止作为隐含的正规化。结果解释了这种计算战略的内在机制和实验效益。对于t-SNE的嵌入阶段,我们描述了整个迭代图中低维图的动态,并确定了一个振动阶段,其特点是集群间反作用和低度地图的扩张行为,以及一个稳定阶段。一般理论解释了T-SNE的快速趋同率和特异性实验性性表现,用以对集群数据进行可视化、理论性调整,提供了对各种NE的理论应用的解释。