Kernel methods are ubiquitous in statistical modeling due to their theoretical guarantees as well as their competitive empirical performance. Polynomial kernels are of particular importance because their feature maps model the interactions between the dimensions of the input data. However, the construction time of explicit feature maps scales exponentially with the polynomial degree, and a naive application of the kernel trick does not scale to large datasets. In this work, we propose Complex-to-Real (CtR) random features for polynomial kernels that leverage intermediate complex random projections and can yield kernel estimates with much lower variances than their real-valued analogs. The resulting features are real-valued, simple to construct, and have the following advantages over the state-of-the-art: 1) shorter construction times, 2) lower kernel approximation errors for commonly used degrees, and 3) a closed-form expression for their variance.
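To make the idea of intermediate complex projections concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the homogeneous kernel k(x, y) = (x·y)^p, random vectors with i.i.d. entries drawn uniformly from {1, -1, i, -i}, a product of p such projections per feature, and a final step that stacks real and imaginary parts so the output is real-valued. The function names `sample_projections`, `ctr_features` and `dot` are hypothetical, and plain Python is used only for portability; a NumPy version would be the usual choice.

```python
import math
import random


def sample_projections(d, degree, num_features, rng):
    """Draw num_features groups of `degree` random vectors in C^d.

    Assumption: entries are i.i.d. uniform on {1, -1, 1j, -1j}, so that
    E[w_j * conj(w_k)] is 1 when j == k and 0 otherwise.
    """
    units = (1, -1, 1j, -1j)
    return [
        [[rng.choice(units) for _ in range(d)] for _ in range(degree)]
        for _ in range(num_features)
    ]


def ctr_features(x, projections):
    """Map a real vector x to 2 * num_features real-valued features.

    Each complex feature is the product of `degree` complex projections of x.
    Its real and imaginary parts are emitted as two real features, so the dot
    product of two feature vectors equals the real part of the averaged
    complex kernel estimate.
    """
    scale = 1.0 / math.sqrt(len(projections))
    feats = []
    for group in projections:
        z = complex(1.0, 0.0)
        for w in group:
            z *= sum(wj * xj for wj, xj in zip(w, x))
        feats.append(z.real * scale)
        feats.append(z.imag * scale)
    return feats


def dot(a, b):
    """Plain real dot product."""
    return sum(ai * bi for ai, bi in zip(a, b))


if __name__ == "__main__":
    rng = random.Random(0)
    x = [0.5, -0.2, 0.1, 0.7]
    y = [0.3, 0.4, -0.6, 0.2]
    projections = sample_projections(d=4, degree=2, num_features=2000, rng=rng)
    approx = dot(ctr_features(x, projections), ctr_features(y, projections))
    exact = dot(x, y) ** 2
    print("approximate kernel:", approx)
    print("exact kernel:      ", exact)
```

Under these assumptions the estimate is unbiased, since each factor (w·x)·conj(w·y) has expectation x·y and the factors are independent. Both inputs must be mapped with the same `projections`; the approximation error shrinks as `num_features` grows.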