Kernel ridge regression (KRR) is widely used for nonparametric regression over reproducing kernel Hilbert spaces. It offers powerful modeling capabilities, but at a significant computational cost: it typically requires $O(n^3)$ computational time and $O(n^2)$ storage space, where $n$ is the sample size. We introduce a novel framework of multi-layer kernel machines that approximate KRR by employing a multi-layer structure and random features, and we study how the number of random features and the layer sizes can be chosen so that the approximate KRR estimate remains minimax optimal. For various classes of random features, including those corresponding to Gaussian and Matérn kernels, we prove that multi-layer kernel machines can achieve $O(n^2\log^2 n)$ computational time and $O(n\log^2 n)$ storage space while yielding fast and minimax optimal approximations to the KRR estimate for nonparametric regression. Moreover, we construct uncertainty quantification for multi-layer kernel machines using conformal prediction techniques with robust coverage properties. The analysis and theoretical predictions are supported by simulations and real data examples.
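To make the cost trade-off concrete, the following is a minimal sketch comparing exact KRR with a plain single-layer random-feature approximation. It is not the multi-layer construction studied here; it uses standard random Fourier features for the Gaussian kernel as a stand-in. The bandwidth `sigma`, penalty `lam`, feature count `m`, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    # Pairwise squared Euclidean distances between rows of A and rows of B.
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def krr_fit_predict(X, y, X_test, sigma, lam):
    # Exact KRR: solve an n x n system, O(n^3) time and O(n^2) storage.
    n = X.shape[0]
    K = gaussian_kernel(X, X, sigma)
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y)
    return gaussian_kernel(X_test, X, sigma) @ alpha

def rff_fit_predict(X, y, X_test, sigma, lam, m, rng):
    # Random Fourier features for the Gaussian kernel (illustrative stand-in):
    # solve an m x m system instead, O(n m^2) time and O(n m) storage.
    n, d = X.shape
    W = rng.normal(scale=1.0 / sigma, size=(d, m))
    b = rng.uniform(0.0, 2.0 * np.pi, size=m)

    def features(A):
        return np.sqrt(2.0 / m) * np.cos(A @ W + b)

    Z = features(X)
    beta = np.linalg.solve(Z.T @ Z + n * lam * np.eye(m), Z.T @ y)
    return features(X_test) @ beta

# Synthetic one-dimensional regression problem (assumed, for illustration only).
rng = np.random.default_rng(0)
n = 500
X = rng.uniform(-1.0, 1.0, size=(n, 1))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.normal(size=n)
X_test = np.linspace(-1.0, 1.0, 200)[:, None]
truth = np.sin(3.0 * X_test[:, 0])

pred_krr = krr_fit_predict(X, y, X_test, sigma=0.5, lam=1e-3)
pred_rff = rff_fit_predict(X, y, X_test, sigma=0.5, lam=1e-3, m=300, rng=rng)

print("KRR RMSE:", np.sqrt(np.mean((pred_krr - truth) ** 2)))
print("RFF RMSE:", np.sqrt(np.mean((pred_rff - truth) ** 2)))
```

In this sketch the random-feature fit replaces the $n \times n$ linear system with an $m \times m$ one, which is where savings of the kind described above would come from when $m$ is much smaller than $n$.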