We propose a new technique for constructing low-rank approximations of matrices that arise in kernel methods for machine learning. Our approach pairs a novel, automatically constructed analytic expansion of the underlying kernel function with a data-dependent compression step that further optimizes the approximation. The procedure runs in linear time and is applicable to any isotropic kernel. Moreover, our method accepts the desired error tolerance as input, in contrast to prevalent methods, which accept the rank as input. Experimental results show that our approach compares favorably to the commonly used Nyström method with respect to both accuracy for a given rank and computational time for a given accuracy, across a variety of kernels, dimensions, and datasets. Notably, in many of these problem settings our approach produces near-optimal low-rank approximations. We provide an efficient open-source implementation of our new technique to complement our theoretical developments and experimental results.
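To make the tolerance-driven, two-stage structure concrete, here is a minimal sketch (not the paper's implementation). It substitutes a stand-in first stage, random Fourier features for a Gaussian kernel, in place of the automatically constructed analytic expansion, and then applies a data-dependent compression of the thin factor to a user-supplied error tolerance. The function names `gaussian_rff` and `compress_to_tolerance`, and all parameter choices, are hypothetical.

```python
import numpy as np

def gaussian_rff(X, n_features=512, lengthscale=1.0, seed=None):
    """Stand-in first stage: random Fourier features for the Gaussian kernel,
    giving K ~= Z @ Z.T (replaces the paper's automatic analytic expansion)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=1.0 / lengthscale, size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def compress_to_tolerance(Z, tol):
    """Second stage: data-dependent compression of the thin factor Z.
    Truncate its SVD so that the discarded spectrum of K ~= Z @ Z.T falls
    below a relative tolerance; the rank is an output, not an input."""
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    energy = s**2                        # eigenvalues of Z @ Z.T
    r = max(1, int((energy >= tol * energy[0]).sum()))
    return U[:, :r] * s[:r]              # factor F with K ~= F @ F.T, rank r

# Usage: the caller specifies a tolerance and reads off the resulting rank.
X = np.random.default_rng(0).normal(size=(2000, 5))
Z = gaussian_rff(X, n_features=512, lengthscale=1.5, seed=0)
F = compress_to_tolerance(Z, tol=1e-6)
print("compressed rank:", F.shape[1])
```

The sketch only illustrates the interface contrast highlighted in the abstract: the approximation rank is determined by the requested tolerance rather than fixed in advance, as it is in Nyström-style methods.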