In this paper, we develop a quadrature framework for large-scale kernel machines via a numerical integration representation. Because the integration domain and measure of typical kernels, e.g., Gaussian kernels and arc-cosine kernels, are fully symmetric, we leverage deterministic fully symmetric interpolatory rules to efficiently compute quadrature nodes and associated weights for kernel approximation. The developed interpolatory rules reduce the number of nodes needed while retaining high approximation accuracy. Further, we randomize these deterministic rules using classical Monte Carlo sampling and control variate techniques, which has two merits: 1) The proposed stochastic rules make the dimension of the feature mapping flexible, so that we can control the discrepancy between the original and approximate kernels by tuning the dimension. 2) Our stochastic rules are unbiased and reduce variance, with a fast convergence rate. In addition, we elucidate the relationship between our deterministic/stochastic interpolatory rules and current quadrature rules for kernel approximation, including sparse grid quadrature and stochastic spherical-radial rules, thereby unifying these methods under our framework. Experimental results on several benchmark datasets show that our methods compare favorably with other representative kernel-approximation-based methods.
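As a rough illustration of the numerical integration view of a kernel, the sketch below approximates the one-dimensional Gaussian kernel exp(-(x - y)^2 / 2) = E_{w ~ N(0,1)}[cos(w (x - y))] in two ways: with a deterministic Gauss-Hermite rule and with plain Monte Carlo sampling. This is not the fully symmetric interpolatory rule developed in the paper; Gauss-Hermite quadrature is swapped in as a simpler stand-in, and the function names, the unit bandwidth, and the node count are assumptions made for the example.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def quadrature_features(x, n_nodes=10):
    """Deterministic features for the 1-D kernel exp(-(x - y)^2 / 2).

    Assumption: Gauss-Hermite nodes stand in for the paper's rules.
    """
    nodes, weights = hermegauss(n_nodes)       # weight function exp(-t^2 / 2)
    probs = weights / np.sqrt(2.0 * np.pi)     # rescale so the weights sum to 1
    x = np.asarray(x, dtype=float)[:, None]    # shape (n_samples, 1)
    scale = np.sqrt(probs)[None, :]
    # cos/sin pairs give phi(x) . phi(y) = sum_i probs_i * cos(t_i * (x - y))
    return np.hstack([scale * np.cos(x * nodes), scale * np.sin(x * nodes)])


def random_fourier_features(x, n_nodes=10, seed=0):
    """Monte Carlo features: frequencies drawn from N(0, 1), equal weights."""
    rng = np.random.default_rng(seed)
    freqs = rng.standard_normal(n_nodes)
    x = np.asarray(x, dtype=float)[:, None]
    scale = 1.0 / np.sqrt(n_nodes)
    return scale * np.hstack([np.cos(x * freqs), np.sin(x * freqs)])


def gaussian_kernel(x):
    """Exact Gram matrix of the unit-bandwidth Gaussian kernel."""
    x = np.asarray(x, dtype=float)
    return np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)


x = np.linspace(-1.0, 1.0, 5)
K = gaussian_kernel(x)
Z_q = quadrature_features(x)
Z_r = random_fourier_features(x)

# Largest entrywise gap between the exact and approximate Gram matrices
err_quadrature = np.max(np.abs(Z_q @ Z_q.T - K))
err_monte_carlo = np.max(np.abs(Z_r @ Z_r.T - K))
print(f"quadrature error:  {err_quadrature:.2e}")
print(f"Monte Carlo error: {err_monte_carlo:.2e}")
```

Both feature maps have the same dimension (two features per node), so the two printed errors compare the rules at equal cost. A tensor product of one-dimensional rules grows exponentially with the input dimension, which is the cost that fully symmetric and sparse grid rules are meant to reduce.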