Random features is one of the most popular techniques for speeding up kernel methods in large-scale problems. Work in this area has been recognized by the NeurIPS Test-of-Time Award in 2017 and as an ICML Best Paper Finalist in 2019. The body of work on random features has grown rapidly, so a comprehensive overview of this topic, explaining the connections among the various algorithms and theoretical results, is desirable. In this survey, we systematically review work on random features from the past ten years. First, we summarize the motivations, characteristics, and contributions of representative random-features-based algorithms according to their sampling schemes, learning procedures, variance-reduction properties, and how they exploit training data. Second, we review theoretical results centered on the following key question: how many random features are needed to ensure high approximation quality, or no loss in the empirical/expected risk of the learned estimator? Third, we provide a comprehensive evaluation of popular random-features-based algorithms on several large-scale benchmark datasets and discuss their approximation quality and prediction performance for classification. Last, we discuss the relationship between random features and modern over-parameterized deep neural networks (DNNs), including the use of high-dimensional random features in the analysis of DNNs as well as the gaps between current theoretical and empirical results. This survey may serve as a gentle introduction to the topic and as a users' guide for practitioners interested in applying the representative algorithms and understanding the theoretical results under various technical assumptions. We hope that this survey will facilitate discussion of the open problems in this area and, more importantly, shed light on future research directions.
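As a concrete illustration of the core idea surveyed here (a minimal sketch, not drawn from the survey's own experiments), random Fourier features approximate a shift-invariant kernel such as the Gaussian/RBF kernel by an explicit low-dimensional feature map, so that an inner product of random features approximates the kernel value. The sketch below assumes the classical construction of Rahimi and Recht (the NeurIPS Test-of-Time-awarded work mentioned above); the function name `rff_features` and all parameter choices are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def rff_features(X, D=5000, sigma=1.0, rng=rng):
    """Random Fourier features z(x) with E[z(x)^T z(y)] equal to the
    RBF kernel k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    d = X.shape[1]
    # Frequencies sampled from the kernel's spectral density N(0, sigma^-2 I),
    # plus uniform random phases (one shared draw for the whole dataset).
    W = rng.normal(0.0, 1.0 / sigma, size=(d, D))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

# Compare the exact RBF kernel matrix with its random-feature approximation.
X = rng.normal(size=(50, 5))
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K_exact = np.exp(-sq_dists / 2.0)   # sigma = 1
Z = rff_features(X, D=5000)
K_approx = Z @ Z.T                  # n x n approximation via D features
err = np.abs(K_exact - K_approx).max()
```

With `D` random features, each kernel entry is a Monte Carlo average, so the approximation error shrinks at the usual O(1/sqrt(D)) rate; how large `D` must be for downstream learning guarantees is exactly the theoretical question the survey reviews.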