Diffusion maps is a manifold learning algorithm widely used for dimensionality reduction. Using a sample from a distribution, it approximates the eigenvalues and eigenfunctions of associated Laplace-Beltrami operators. Theoretical bounds on the approximation error are, however, generally much weaker than the rates seen in practice. This paper uses new approaches to improve the error bounds in the model case where the distribution is supported on a hypertorus. For the data sampling (variance) component of the error we make spatially localised compact embedding estimates on certain Hardy spaces; we study the deterministic (bias) component as a perturbation of the Laplace-Beltrami operator's associated PDE and apply relevant spectral stability results. Using these approaches, we match long-standing pointwise error bounds for both the spectral data and the norm convergence of the operator discretisation. We also introduce an alternative normalisation for diffusion maps based on Sinkhorn weights. This normalisation approximates a Langevin diffusion on the sample and yields a symmetric operator approximation. We prove that it converges faster than the standard normalisation on flat domains, and present a highly efficient algorithm to compute the Sinkhorn weights.
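For concreteness, the following is a minimal Python sketch of the standard diffusion maps construction with the density-correcting (alpha = 1) normalisation of Coifman and Lafon; it is illustrative only, not the paper's implementation. Euclidean distances and a Gaussian kernel with bandwidth parameter `eps` are assumed for simplicity (on a hypertorus one would substitute periodic distances), and all names are hypothetical.

```python
import numpy as np

def diffusion_maps(X, eps, n_eigs=10):
    """Sketch of diffusion maps with the alpha = 1 (density-correcting)
    normalisation; eigenvalues are rescaled so they approximate
    Laplace-Beltrami eigenvalues as eps -> 0 and n -> infinity."""
    # Gaussian kernel on pairwise squared Euclidean distances.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq / (4.0 * eps))
    # First normalisation: divide out the kernel density estimate,
    # removing the sampling density's influence on the limit operator.
    q = K.sum(axis=1)
    K1 = K / np.outer(q, q)
    # Second normalisation would give the Markov matrix P = D^{-1} K1;
    # here we eigendecompose the symmetric conjugate
    # S = D^{-1/2} K1 D^{-1/2}, which shares P's eigenvalues.
    d = K1.sum(axis=1)
    S = K1 / np.sqrt(np.outer(d, d))
    evals, evecs = np.linalg.eigh(S)
    order = np.argsort(evals)[::-1][:n_eigs]
    evals, evecs = evals[order], evecs[:, order]
    # Recover eigenvectors of P from those of S.
    phi = evecs / np.sqrt(d)[:, None]
    return (evals - 1.0) / eps, phi
```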
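The Sinkhorn normalisation mentioned above replaces the two-step density correction with a single symmetric rescaling: weights v > 0 are sought such that diag(v) K diag(v) is doubly stochastic, giving a symmetric Markov operator on the sample. The sketch below shows the classical damped symmetric Sinkhorn fixed-point iteration, under the assumption of a symmetric positive kernel matrix `K`; it is not the highly efficient algorithm developed in the paper, and the names are hypothetical.

```python
import numpy as np

def sinkhorn_weights(K, max_iter=1000, tol=1e-12):
    """Sketch of symmetric Sinkhorn balancing: find v > 0 such that
    diag(v) @ K @ diag(v) has unit row (and column) sums."""
    v = 1.0 / np.sqrt(K.sum(axis=1))  # a reasonable starting guess
    for _ in range(max_iter):
        # Geometric-mean damping avoids the oscillation of the plain
        # update v <- 1 / (K v) that can occur when K is symmetric;
        # at the fixed point, v * (K v) = 1 elementwise.
        v_new = np.sqrt(v / (K @ v))
        if np.max(np.abs(v_new - v)) < tol * np.max(v):
            v = v_new
            break
        v = v_new
    return v
```

With these weights, `P = v[:, None] * K * v[None, :]` is symmetric with unit row sums, and `(P - I) / eps` serves as the symmetric operator approximation whose spectrum is studied.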