Kernel methods are widely used in machine learning. For large-scale applications, their success hinges on the ability to operate on a large dense kernel matrix K. To reduce the computational cost, Nyström methods use landmark points to compute a low-rank approximation to a symmetric positive semi-definite (SPSD) matrix K, and many variants have been developed in the past few years. For indefinite kernels, however, it has not even been established whether Nyström approximations are applicable. In this paper, we study for the first time, both theoretically and numerically, the Nyström method for approximating general symmetric kernels, including indefinite ones. We first develop a unified theoretical framework for analyzing Nyström approximations that is valid for both SPSD and indefinite kernels and is independent of the specific scheme for selecting landmark points. To address the accuracy and numerical stability issues in Nyström approximation, we then study the impact of data geometry on the spectral properties of the corresponding kernel matrix and leverage discrepancy theory to propose the anchor net method for computing Nyström approximations. The anchor net method operates entirely on the dataset, without requiring access to K or its matrix-vector products, and scales linearly for both SPSD and indefinite kernel matrices. Extensive numerical experiments suggest that indefinite kernels are much more challenging than SPSD kernels and that most existing methods suffer from numerical instability. Results on various kinds of kernels and machine learning datasets demonstrate that the new method resolves the numerical instability and achieves better accuracy at lower computational cost than state-of-the-art Nyström methods.
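For readers unfamiliar with the construction, the following is a minimal sketch of a generic Nyström approximation K ≈ C W⁺ Cᵀ, where C holds the kernel values between all points and the landmarks and W is the kernel matrix on the landmarks alone. It is not the anchor net method: landmarks are chosen here by uniform random sampling purely for illustration. The function names, the two example kernels, their parameters, and the pseudo-inverse cutoff `rcond` are all assumptions made for this sketch. A truncated pseudo-inverse is used so that the same code can be applied to an indefinite W.

```python
import numpy as np

def gaussian_kernel(X, Y, bandwidth=1.0):
    """SPSD example: exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq_dist = (X**2).sum(axis=1)[:, None] + (Y**2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth**2))

def tanh_kernel(X, Y, scale=0.5, shift=-1.0):
    """Example of a kernel that is indefinite in general: tanh(scale * <x, y> + shift)."""
    return np.tanh(scale * X @ Y.T + shift)

def nystrom_factors(X, landmarks, kernel, rcond=1e-8):
    """Return C (n x k) and a truncated pseudo-inverse of W (k x k).

    Only kernel evaluations against the k landmarks are needed, so the
    full n x n matrix K is never formed here.
    """
    C = kernel(X, landmarks)
    W = kernel(landmarks, landmarks)
    W = 0.5 * (W + W.T)  # remove asymmetry caused by rounding
    # hermitian=True uses an eigendecomposition and discards eigenvalues whose
    # magnitude is below rcond * (largest magnitude), so W may be indefinite.
    W_pinv = np.linalg.pinv(W, rcond=rcond, hermitian=True)
    return C, W_pinv

def relative_error(X, idx, kernel):
    """Relative Frobenius error of the approximation (forms K only for checking)."""
    C, W_pinv = nystrom_factors(X, X[idx], kernel)
    K = kernel(X, X)
    K_approx = C @ W_pinv @ C.T
    return np.linalg.norm(K - K_approx) / np.linalg.norm(K)

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 2))                      # synthetic data, for illustration only
idx = rng.choice(X.shape[0], size=50, replace=False)   # uniform random landmarks

errors = {
    "gaussian": relative_error(X, idx, gaussian_kernel),
    "tanh": relative_error(X, idx, tanh_kernel),
}
for name, err in errors.items():
    print(f"{name}: relative Frobenius error = {err:.3e}")
```

The cutoff `rcond` trades accuracy for stability: a smaller value keeps more of W's spectrum but amplifies rounding errors when W is ill-conditioned, which is the kind of instability the abstract refers to.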