Random Fourier features is a widely used, simple, and effective technique for scaling up kernel methods. The existing theoretical analysis of the approach, however, remains focused on specific learning tasks and typically gives pessimistic bounds which are at odds with the empirical results. We tackle these problems and provide the first unified risk analysis of learning with random Fourier features using the squared error and Lipschitz continuous loss functions. In our bounds, the trade-off between the computational cost and the expected risk convergence rate is problem specific and expressed in terms of the regularization parameter and the \emph{number of effective degrees of freedom}. We study both the standard random Fourier features method for which we improve the existing bounds on the number of features required to guarantee the corresponding minimax risk convergence rate of kernel ridge regression, as well as a data-dependent modification which samples features proportional to \emph{ridge leverage scores} and further reduces the required number of features. As ridge leverage scores are expensive to compute, we devise a simple approximation scheme which provably reduces the computational cost without loss of statistical efficiency.
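For concreteness, here is a minimal sketch of the standard random Fourier features construction for a Gaussian kernel. It is an illustration, not the paper's experimental setup: the bandwidth `gamma`, feature count `D`, and synthetic data `X` are assumptions chosen only to show that `Z @ Z.T` approximates the exact kernel matrix.

```python
import numpy as np

def rff_features(X, D, gamma, rng):
    """Map X (n x d) to D random Fourier features whose inner products
    approximate the Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    d = X.shape[1]
    # Frequencies are drawn from the kernel's spectral density,
    # which for this Gaussian kernel is N(0, 2 * gamma * I).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, D))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
Z = rff_features(X, D=2000, gamma=0.5, rng=rng)
K_hat = Z @ Z.T  # low-rank surrogate for the exact kernel matrix
K = np.exp(-0.5 * ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
print(np.abs(K_hat - K).max())  # approximation error shrinks as D grows
```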
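The data-dependent modification can be sketched in the same setting, continuing the snippet above: starting from an initial pool of features `Z`, compute empirical ridge leverage scores and resample features proportionally. This is only a sketch of the sampling idea under assumed names (`lam` for the regularization parameter, `m` for the reduced feature count); the paper's cheap approximation scheme for the scores is not reproduced here.

```python
def ridge_leverage_resample(Z, lam, m, rng):
    """Resample m columns of Z (n x D) with probabilities proportional
    to their empirical ridge leverage scores."""
    n, D = Z.shape
    # Score of feature i: the i-th diagonal entry of
    # Z^T (Z Z^T + n * lam * I)^{-1} Z.
    G = Z @ Z.T + n * lam * np.eye(n)
    scores = np.sum(Z * np.linalg.solve(G, Z), axis=0)
    probs = scores / scores.sum()
    idx = rng.choice(D, size=m, replace=True, p=probs)
    # Importance-weight the kept columns so Z_m @ Z_m.T remains an
    # unbiased estimate of Z @ Z.T.
    return Z[:, idx] / np.sqrt(m * probs[idx])

Z_m = ridge_leverage_resample(Z, lam=1e-3, m=200, rng=rng)
```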