Spectral methods, which represent data points by eigenvectors of kernel matrices or graph Laplacian matrices, have been a primary tool in unsupervised data analysis. In many application scenarios, parametrizing the spectral embedding by a neural network that can be trained over batches of data samples offers a promising way to achieve automatic out-of-sample extension as well as computational scalability. Such an approach was taken in the original SpectralNet paper (Shaham et al., 2018), which we call SpecNet1. The current paper introduces a new neural network approach, named SpecNet2, to compute spectral embedding, which optimizes an objective equivalent to the eigen-problem and removes the orthogonalization layer in SpecNet1. SpecNet2 also allows separating the sampling of rows and columns of the graph affinity matrix by tracking the neighbors of each data point through the gradient formula. Theoretically, we show that any local minimizer of the new orthogonalization-free objective reveals the leading eigenvectors. Furthermore, global convergence of a batch-based gradient descent method for this new orthogonalization-free objective is proved. Numerical experiments demonstrate the improved performance and computational efficiency of SpecNet2 on simulated data and image datasets.