SpectralNet is a graph clustering method that uses a neural network to find an embedding that separates the data. So far it has only been used with $k$-nn graphs, which are usually constructed using a distance metric (e.g., Euclidean distance). $k$-nn graphs restrict each point to a fixed number of neighbors regardless of the local statistics around it. We propose a new SpectralNet similarity metric based on random projection trees (rpTrees). Our experiments show that SpectralNet achieves better clustering accuracy with the rpTree similarity metric than with a $k$-nn graph built on a distance metric. We also found that the rpTree parameters, namely the leaf size and the choice of projection direction, do not affect the clustering accuracy. It is therefore computationally efficient to keep the leaf size on the order of $\log(n)$ and to project the points onto a random direction instead of searching for the direction of maximum dispersion.
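To make the rpTree idea concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes that two points count as similar when they fall in the same leaf, that each split is made at the median of the projections, and that similarities are averaged over several trees; the abstract specifies none of these details. The names `rptree_leaves`, `rptree_similarity`, and `n_trees` are hypothetical. Only the leaf size on the order of $\log(n)$ and the random projection direction come from the text.

```python
import math
import numpy as np


def rptree_leaves(X, leaf_size, rng, idx=None):
    """Return a list of index arrays, one per leaf of a random projection tree."""
    if idx is None:
        idx = np.arange(X.shape[0])
    # Stop splitting once the node is small enough to be a leaf.
    if idx.size <= leaf_size:
        return [idx]
    # Project onto a random direction (no search for maximum dispersion).
    direction = rng.standard_normal(X.shape[1])
    proj = X[idx] @ direction
    # Assumption: split at the median of the projected values.
    split = np.median(proj)
    left = idx[proj <= split]
    right = idx[proj > split]
    # Guard against degenerate splits caused by tied projections.
    if left.size == 0 or right.size == 0:
        return [idx]
    return (rptree_leaves(X, leaf_size, rng, left)
            + rptree_leaves(X, leaf_size, rng, right))


def rptree_similarity(X, n_trees=10, seed=0):
    """Fraction of trees in which each pair of points shares a leaf."""
    n = X.shape[0]
    # Leaf size on the order of log(n), as suggested in the text.
    leaf_size = max(2, math.ceil(math.log(n)))
    rng = np.random.default_rng(seed)
    S = np.zeros((n, n))
    for _ in range(n_trees):
        for leaf in rptree_leaves(X, leaf_size, rng):
            # Assumption: points sharing a leaf are treated as similar.
            S[np.ix_(leaf, leaf)] += 1.0
    return S / n_trees


if __name__ == "__main__":
    data_rng = np.random.default_rng(1)
    # Two well-separated Gaussian blobs as toy data.
    X = np.vstack([data_rng.normal(0.0, 1.0, (100, 2)),
                   data_rng.normal(8.0, 1.0, (100, 2))])
    S = rptree_similarity(X)
    print(S.shape)
```

Unlike a $k$-nn graph, the number of neighbors per point is not fixed in this sketch: it depends on how many points share a leaf, so it follows the local structure of the data. The resulting matrix `S` could stand in for the affinity matrix that a $k$-nn graph would otherwise supply.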