Kernel methods are powerful tools in various data analysis tasks. Yet, in many cases, their time and space complexity render them impractical for large datasets. Various kernel approximation methods were proposed to overcome this issue, with the most prominent method being the Nystr{\"o}m method. In this paper, we derive a perturbation-based kernel approximation framework building upon results from classical perturbation theory. We provide an error analysis for this framework, and prove that in fact, it generalizes the Nystr{\"o}m method and several of its variants. Furthermore, we show that our framework gives rise to new kernel approximation schemes, that can be tuned to take advantage of the structure of the approximated kernel matrix. We support our theoretical results numerically and demonstrate the advantages of our approximation framework on both synthetic and real-world data.
翻译:内核法是各种数据分析任务中的有力工具。 然而,在许多情况下,它们的时间和空间复杂性使得大型数据集不切实际。 提出了各种内核近似法来克服这一问题,其中最突出的方法是Nystr o'm 法。 在本文中,我们根据古典扰动理论的结果,得出一个以扰动为基础的内核近似框架。 我们对这一框架提供了错误分析,并证明事实上它概括了Nystr o'm 法及其若干变体。 此外,我们表明,我们的框架产生了新的内核近似办法,可以加以调整,以利用近似内核矩阵的结构。我们从数字上支持我们的理论结果,并展示了我们的近核框架在合成和现实世界数据方面的优势。