Smoothing splines are widely used in nonparametric regression. However, their computational burden is significant when the sample size $n$ is large. When the number of predictors $d\geq2$, the computational cost of smoothing splines is of the order $O(n^3)$ using the standard approach. Many methods have been developed to approximate smoothing spline estimators using $q$ basis functions instead of $n$, resulting in a computational cost of the order $O(nq^2)$. These methods are called basis selection methods. Despite their algorithmic benefits, most basis selection methods require the assumption that the sample is uniformly distributed on a hypercube, and their performance may deteriorate when this assumption is not met. To overcome this obstacle, we develop an efficient algorithm that adapts to the unknown probability density function of the predictors. Theoretically, we show that the proposed estimator has the same convergence rate as the full-basis estimator when $q$ is roughly of the order $O[n^{2d/\{(pr+1)(d+2)\}}]$, where $p\in[1,2]$ and $r\approx 4$ are constants that depend on the type of the spline. Numerical studies on various synthetic datasets demonstrate the superior performance of the proposed estimator in comparison with mainstream competitors.
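To make the $O(nq^2)$ cost concrete, the following is a minimal sketch of generic basis selection, not the adaptive algorithm proposed here. It assumes uniform random selection of the $q$ basis points and uses a Gaussian kernel as a stand-in for the reproducing kernel of the spline space. The function names (`gaussian_kernel`, `basis_size`, `fit_selected_basis`, `predict`) and the parameters `bandwidth` and `lam` are hypothetical and chosen only for illustration.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=0.3):
    # Stand-in kernel (assumption): a real smoothing spline would use the
    # reproducing kernel of the chosen spline space instead.
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))


def basis_size(n, d, p=1.0, r=4.0):
    # Order of q quoted in the abstract: n^{2d / ((p r + 1)(d + 2))}.
    # The leading constant is not specified there, so it is taken as 1 here.
    exponent = 2.0 * d / ((p * r + 1.0) * (d + 2.0))
    return max(1, int(np.ceil(n ** exponent)))


def fit_selected_basis(x, y, q, lam=1e-3, seed=0):
    # Uniform random selection of q basis points (assumption; the paper's
    # method adapts the selection to the predictor density instead).
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    idx = rng.choice(n, size=q, replace=False)
    centers = x[idx]

    R = gaussian_kernel(x, centers)          # n x q design matrix
    R_qq = gaussian_kernel(centers, centers)  # q x q penalty matrix

    # Normal equations of (1/n)||y - R c||^2 + lam * c' R_qq c.
    # Forming R.T @ R costs O(n q^2), which dominates the O(q^3) solve.
    A = R.T @ R + n * lam * R_qq
    b = R.T @ y
    # lstsq is used instead of solve because A can be ill-conditioned.
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return centers, coef


def predict(x_new, centers, coef):
    # Evaluate the fitted function at new predictor values.
    return gaussian_kernel(x_new, centers) @ coef
```

With $d=2$, $p=1$, and $r=4$, the exponent is $2d/\{(pr+1)(d+2)\} = 4/20 = 0.2$, so `basis_size` grows like $n^{0.2}$. In practice a multiple of this value would likely be used, since the stated rate fixes only the order of $q$ and not its constant.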