Rahimi and Recht (2007) introduced the idea of decomposing positive definite shift-invariant kernels by randomly sampling from their spectral distribution. This famous technique, known as Random Fourier Features (RFF), is in principle applicable to any such kernel whose spectral distribution can be identified and simulated. In practice, however, it is usually applied to the Gaussian kernel because of its simplicity, since its spectral distribution is also Gaussian. Clearly, simple spectral sampling formulas would be desirable for broader classes of kernels. In this paper, we prove that the spectral distribution of every positive definite isotropic kernel can be decomposed as a scale mixture of $\alpha$-stable random vectors, and we identify the scaling distribution as a function of the kernel. This constructive decomposition provides a simple and ready-to-use spectral sampling formula for every multivariate positive definite shift-invariant kernel, including exponential power kernels, generalized Mat\'ern kernels, generalized Cauchy kernels, as well as newly introduced kernels such as the Beta, Kummer, and Tricomi kernels. In particular, we show that the spectral distributions of these kernels are scale mixtures of the multivariate Gaussian distribution. This provides a very simple way to adapt existing random Fourier features software based on Gaussian kernels to any positive definite shift-invariant kernel. This result has broad applications for support vector machines, kernel ridge regression, Gaussian processes, and other kernel-based machine learning techniques for which the random Fourier features technique is applicable.
翻译:暂无翻译