A Shared Nearest Neighbor (SNN) graph is a type of graph construction using shared nearest neighbor information, which is a secondary similarity measure based on the rankings induced by a primary $k$-nearest neighbor ($k$-NN) measure. SNN measures have been touted as being less prone to the curse of dimensionality than conventional distance measures, and thus methods using SNN graphs have been widely used in applications, particularly in clustering high-dimensional data sets and in finding outliers in subspaces of high dimensional data. Despite this, the theoretical study of SNN graphs and graph Laplacians remains unexplored. In this pioneering work, we make the first contribution in this direction. We show that large scale asymptotics of an SNN graph Laplacian reach a consistent continuum limit; this limit is the same as that of a $k$-NN graph Laplacian. Moreover, we show that the pointwise convergence rate of the graph Laplacian is linear with respect to $(k/n)^{1/m}$ with high probability.
翻译:共享近邻图形是一种图表结构,它使用共享的近邻信息进行图解构建,这是根据最近近邻(k$-NN)测量的排名而得出的二级相似性测量。 SNN措施被认为比常规距离测量更不易受到维度诅咒,因此,SNN图形的使用方法在应用中被广泛使用,特别是在高维数据集组群和在高维数据子空间中查找外源。尽管如此,对 SNN图形和拉普拉西安图的理论研究仍未进行探讨。在这项开创性工作中,我们首先向这个方向作出贡献。我们表明,SNNP图拉普拉提亚的大规模模拟达到一个一致的连续性限值;这一限值与$-NN图Laplacian的数值相同。此外,我们显示,Laplacian图的中点汇合率与$(k/n)1/m}高概率线。</s>