The goal of regression is to recover, from noisy observations, the unknown underlying function that best links a set of predictors to an outcome. In nonparametric regression, one assumes that the regression function belongs to a pre-specified infinite-dimensional function space (the hypothesis space). In the online setting, where observations arrive in a stream, it is computationally preferable to update an estimate iteratively rather than to refit an entire model repeatedly. Inspired by nonparametric sieve estimation and stochastic approximation methods, we propose a sieve stochastic gradient descent estimator (Sieve-SGD) for the case where the hypothesis space is a Sobolev ellipsoid. We show that Sieve-SGD has rate-optimal mean squared error (MSE) under a set of simple and direct conditions. We also show that, under some conditions on the distribution of the predictors, the Sieve-SGD estimator can be computed at low time cost and requires almost minimal memory among all statistically rate-optimal estimators.
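To make the idea of combining a sieve with stochastic gradient descent concrete, here is a minimal sketch of one way such an online estimator could look. It is not taken from the paper and should not be read as the authors' exact algorithm. The cosine basis on [0, 1], the basis-growth rule n^(1/(2s+1)), the step-size schedule, the coordinate weights j^(-2*omega), the running average of iterates, and every name (`SieveSGDSketch`, `cosine_basis`, `s`, `step`, `omega`, `max_basis`) are illustrative assumptions.

```python
import numpy as np


def cosine_basis(x, J):
    """First J cosine basis functions on [0, 1] evaluated at a scalar x.

    Assumed basis: psi_0 = 1, psi_j = sqrt(2) * cos(pi * j * x) for j >= 1.
    """
    j = np.arange(J)
    phi = np.sqrt(2.0) * np.cos(np.pi * j * x)
    phi[0] = 1.0
    return phi


class SieveSGDSketch:
    """Illustrative online estimator: SGD on a slowly growing truncated basis."""

    def __init__(self, s=2.0, step=1.0, omega=0.51, max_basis=200):
        self.s = s                  # assumed smoothness level (hypothetical)
        self.step = step            # base learning rate (hypothetical)
        self.omega = omega          # decay exponent of coordinate weights (hypothetical)
        self.max_basis = max_basis  # hard cap on memory used by this sketch
        self.beta = np.zeros(max_basis)      # current SGD iterate
        self.beta_bar = np.zeros(max_basis)  # running average of iterates
        self.n = 0                  # number of observations seen so far

    def num_basis(self, n):
        # Sieve size grows like n^(1 / (2s + 1)); this rule is an assumption.
        J = int(np.floor(n ** (1.0 / (2.0 * self.s + 1.0))))
        return min(self.max_basis, max(1, J))

    def update(self, x, y):
        """Process one observation (x, y) and discard it afterwards."""
        self.n += 1
        J = self.num_basis(self.n)
        phi = cosine_basis(x, J)
        resid = y - self.beta[:J] @ phi
        gamma = self.step * self.n ** (-1.0 / (2.0 * self.s + 1.0))
        weights = np.arange(1, J + 1) ** (-2.0 * self.omega)
        # Gradient step on the squared loss, damped on higher frequencies.
        self.beta[:J] += gamma * resid * weights * phi
        # Incremental update of the average of all iterates so far.
        self.beta_bar += (self.beta - self.beta_bar) / self.n

    def predict(self, x):
        J = self.num_basis(max(self.n, 1))
        return float(self.beta_bar[:J] @ cosine_basis(x, J))
```

In use, one would call `update(x, y)` once per incoming observation and `predict(x)` whenever an estimate is needed. Only the two coefficient vectors are stored, so in this sketch memory is governed by the number of active basis functions rather than by the number of observations seen.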