随机傅里叶特征对于保持核距离的相对误差 (On The Relative Error of Random Fourier Features for Preserving Kernel Distance)

The method of random Fourier features (RFF), proposed in a seminal paper by Rahimi and Recht (NIPS'07), is a powerful technique to find approximate low-dimensional representations of points in (high-dimensional) kernel space, for shift-invariant kernels. While RFF has been analyzed under various notions of error guarantee, the ability to preserve the kernel distance with \emph{relative} error is less understood. We show that for a significant range of kernels, including the well-known Laplacian kernels, RFF cannot approximate the kernel distance with small relative error using low dimensions. We complement this by showing as long as the shift-invariant kernel is analytic, RFF with $\mathrm{poly}(\epsilon^{-1} \log n)$ dimensions achieves $\epsilon$-relative error for pairwise kernel distance of $n$ points, and the dimension bound is improved to $\mathrm{poly}(\epsilon^{-1}\log k)$ for the specific application of kernel $k$-means. Finally, going beyond RFF, we make the first step towards data-oblivious dimension-reduction for general shift-invariant kernels, and we obtain a similar $\mathrm{poly}(\epsilon^{-1} \log n)$ dimension bound for Laplacian kernels. We also validate the dimension-error tradeoff of our methods on simulated datasets, and they demonstrate superior performance compared with other popular methods including random-projection and Nystr\"{o}m methods.

翻译：随机 Fourier 特征 (RFF) 方法是 Rahimi 和 Recht 在 NIPS'07 的一篇重要论文中提出的一种强大的技术，用于找到表示核空间中点的近似低维表示，适用于移不变的核。尽管 RFF 已经在多种误差保证概念下进行了分析，但是保持核距离的相对误差的能力仍不为人所知。我们表明，在包括著名的 Laplacian 核在内的许多核的显著范围内，使用低维度的 RFF 无法用小的相对误差逼近核距离。我们补充说明，只要移不变核是解析的，使用$\mathrm {poly}(\epsilon^{-1} \log n)$ 维度对于 n 个点的成对核距离，RFF 就可以实现 $\epsilon$ 相对误差，并且对于特定的应用程序核 k-均值，维度界限可改进为$\mathrm {poly}(\epsilon^{-1} \log k)$。最后，超越 RFF，我们迈出了通往一般移不变核的数据无关降维的第一步，并获得了 Laplacian 核的类似$\mathrm {poly}(\epsilon^{-1} \log n)$ 的维度界限。我们还在模拟数据集上验证了我们方法的维度-误差权衡，并且它们展示了与其他流行方法（包括随机投影和 Nyström 方法）相比的优越性能。