Manifold learning methods play a prominent role in nonlinear dimensionality reduction and other tasks involving high-dimensional data sets with low intrinsic dimensionality. Many of these methods are graph-based: they associate a vertex with each data point and a weighted edge with each pair. Existing theory shows that the Laplacian matrix of the graph converges to the Laplace-Beltrami operator of the data manifold, under the assumption that the pairwise affinities are based on the Euclidean norm. In this paper, we determine the limiting differential operator for graph Laplacians constructed using $\textit{any}$ norm. Our proof involves an interplay between the second fundamental form of the manifold and the convex geometry of the given norm's unit ball. To demonstrate the potential benefits of non-Euclidean norms in manifold learning, we consider the task of mapping the motion of large molecules with continuous variability. In a numerical simulation we show that a modified Laplacian eigenmaps algorithm, based on the Earthmover's distance, outperforms the classic Euclidean Laplacian eigenmaps, both in terms of computational cost and the sample size needed to recover the intrinsic geometry.
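As a rough illustration of the graph-based construction described above, the following minimal sketch builds a graph Laplacian whose pairwise affinities come from an arbitrary $\ell_p$ norm and then computes a Laplacian-eigenmaps-style embedding. The Gaussian kernel, the bandwidth `sigma`, the unnormalized Laplacian $L = D - W$, and the function names are all assumptions made for illustration; the abstract does not specify them, and the $\ell_1$ norm here merely stands in for a non-Euclidean norm rather than the Earthmover's distance used in the paper's simulation.

```python
import numpy as np

def graph_laplacian(points, norm_ord=1, sigma=1.0):
    """Unnormalized graph Laplacian L = D - W with affinities from an l_p norm.

    Assumed (not from the paper): Gaussian kernel with bandwidth `sigma`.
    """
    # Pairwise difference vectors, shape (n, n, d).
    diffs = points[:, None, :] - points[None, :, :]
    # Pairwise distances under the chosen vector norm (norm_ord=2 is Euclidean).
    dists = np.linalg.norm(diffs, ord=norm_ord, axis=-1)
    # One weighted edge per pair of vertices; no self-loops.
    weights = np.exp(-(dists / sigma) ** 2)
    np.fill_diagonal(weights, 0.0)
    degrees = weights.sum(axis=1)
    return np.diag(degrees) - weights

def laplacian_eigenmaps(points, n_components=2, norm_ord=1, sigma=1.0):
    """Embed points using the low-frequency eigenvectors of the graph Laplacian."""
    lap = graph_laplacian(points, norm_ord=norm_ord, sigma=sigma)
    # eigh returns eigenvalues in ascending order for a symmetric matrix.
    _, eigvecs = np.linalg.eigh(lap)
    # Skip the first eigenvector (constant for a connected graph).
    return eigvecs[:, 1:n_components + 1]

# Hypothetical data: a closed curve (a circle) sampled in R^3.
rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 2.0 * np.pi, size=200)
X = np.stack([np.cos(angles), np.sin(angles), np.zeros_like(angles)], axis=1)

L1_lap = graph_laplacian(X, norm_ord=1, sigma=0.5)
embedding = laplacian_eigenmaps(X, n_components=2, norm_ord=1, sigma=0.5)
```

Changing `norm_ord` (for example to `2` or `np.inf`) changes only the affinities; the rest of the pipeline is untouched, which is the sense in which the choice of norm can be studied independently of the embedding step.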