Spectral clustering and diffusion maps are celebrated dimensionality reduction algorithms built on eigen-elements related to the diffusive structure of the data. The core of these procedures is the approximation of a Laplacian through a graph kernel approach, however this local average construction is known to be cursed by the high-dimension d. In this article, we build a different estimator of the Laplacian, via a reproducing kernel Hilbert space method, which adapts naturally to the regularity of the problem. We provide non-asymptotic statistical rates proving that the kernel estimator we build can circumvent the curse of dimensionality. Finally we discuss techniques (Nystr\"om subsampling, Fourier features) that enable to reduce the computational cost of the estimator while not degrading its overall performance.
翻译:光谱集群和传播地图是人们庆祝的维度减少算法,这些算法是建立在与数据差异结构有关的电子元素之上的。这些程序的核心是通过图形内核法近似拉平岩层,然而,据了解,这种局部平均构造受到高差异的诅咒。在本篇文章中,我们通过复制内核Hilbert空间方法为拉平层建立一个不同的测算器,该方法自然地适应问题的规律性。我们提供了非简易的统计率,证明我们建立的测算器可以绕过维度的诅咒。最后,我们讨论了能够降低测算器计算成本同时又不降低其总体性能的技术(Nystrle'om次抽样、Fourier特性 )。