Kernel methods are a highly effective and widely used collection of modern machine learning algorithms. A fundamental limitation of virtually all such methods is the cost of computations involving the kernel matrix, which naively scales quadratically (e.g., constructing the kernel matrix and matrix-vector multiplication) or cubically (solving linear systems) with the size of the data set $N$. We propose the Fast Kernel Transform (FKT), a general algorithm to compute matrix-vector multiplications (MVMs) for data sets in moderate dimensions with quasilinear complexity. Typically, analytically grounded fast multiplication methods require specialized development for specific kernels. In contrast, our scheme is based on automatic differentiation and automated symbolic computations that leverage the analytical structure of the underlying kernel. This allows the FKT to be easily applied to a broad class of kernels, including Gaussian, Matérn, and Rational Quadratic covariance functions and physically motivated Green's functions, including those of the Laplace and Helmholtz equations. Furthermore, the FKT maintains a high, quantifiable, and controllable level of accuracy, properties that many acceleration methods lack. We illustrate the efficacy and versatility of the FKT by providing timing and accuracy benchmarks and by applying it to scale t-distributed stochastic neighbor embedding (t-SNE) and Gaussian processes to large real-world data sets.
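To make the quadratic baseline concrete, the following is a minimal sketch of the naive dense kernel MVM that the abstract describes as the bottleneck. It is not the FKT itself. The Gaussian kernel parameterization, the function names, and the `lengthscale` and `block_size` parameters are illustrative assumptions and do not come from the paper.

```python
import numpy as np


def gaussian_kernel_matrix(X, Y, lengthscale=1.0):
    """Dense matrix K[i, j] = exp(-||x_i - y_j||^2 / (2 * lengthscale^2)).

    Assumed parameterization, chosen only for illustration.
    """
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Rounding can produce tiny negative values; clamp them to zero.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * lengthscale**2))


def naive_kernel_mvm(X, v, lengthscale=1.0):
    """Form the full N x N matrix, then multiply: O(N^2) time and memory."""
    K = gaussian_kernel_matrix(X, X, lengthscale)
    return K @ v


def blocked_kernel_mvm(X, v, lengthscale=1.0, block_size=256):
    """Same product computed in row blocks.

    Memory drops to O(block_size * N), but the time is still O(N^2).
    """
    n = X.shape[0]
    out = np.empty(n)
    for start in range(0, n, block_size):
        stop = min(start + block_size, n)
        K_block = gaussian_kernel_matrix(X[start:stop], X, lengthscale)
        out[start:stop] = K_block @ v
    return out
```

Calling `naive_kernel_mvm(X, v)` with `X` of shape `(N, d)` and `v` of shape `(N,)` touches all $N^2$ kernel entries. Blocking reduces only the memory footprint, which is why quasilinear schemes need a different approach that exploits the kernel's analytical structure.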