The goal of regression is to recover, from noisy observations, the unknown underlying function that best links a set of predictors to an outcome. In nonparametric regression, one assumes that the regression function belongs to a pre-specified infinite-dimensional function space (the hypothesis space). In the online setting, where observations arrive in a stream, it is computationally preferable to update an estimate iteratively rather than to refit the entire model repeatedly. Inspired by nonparametric sieve estimation and stochastic approximation methods, we propose a sieve stochastic gradient descent estimator (Sieve-SGD) for the case where the hypothesis space is a Sobolev ellipsoid. We show that Sieve-SGD attains rate-optimal mean squared error (MSE) under a set of simple and direct conditions. The proposed estimator can be constructed at low computational (time and space) expense: we also formally show that Sieve-SGD requires nearly minimal memory among all statistically rate-optimal estimators.
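Since the abstract only names the construction, a minimal sketch may help fix ideas: observations arrive one at a time, the number of basis functions grows slowly with the sample size, and each coefficient receives a weighted stochastic gradient step. The cosine basis, the growth rate J_i ~ i^{1/(2s+1)}, the step-size schedule, and the per-frequency weights below are illustrative assumptions under a smoothness level s, not the paper's exact specification.

```python
# A minimal, illustrative sketch of a sieve-SGD-style update; the basis,
# schedules, and constants are assumptions, not the paper's exact construction.
import numpy as np

rng = np.random.default_rng(0)
s = 2.0                                      # assumed smoothness of the target
f_true = lambda x: 0.5 + np.cos(np.pi * x)   # target with a simple cosine expansion

def basis(x, J):
    """Constant plus the first J - 1 cosine basis functions at scalar x."""
    b = np.sqrt(2.0) * np.cos(np.pi * np.arange(J) * x)
    b[0] = 1.0
    return b

beta = np.zeros(1)                           # coefficient vector, grown online
for i in range(1, 5001):
    x = rng.uniform()
    y = f_true(x) + 0.1 * rng.standard_normal()

    J = int(np.ceil(i ** (1.0 / (2.0 * s + 1.0)))) + 1   # sieve size grows with i
    if J > beta.size:
        beta = np.append(beta, np.zeros(J - beta.size))  # new coefficients start at 0

    psi = basis(x, beta.size)
    resid = y - beta @ psi                               # streaming residual
    lr = 0.5 * i ** (-1.0 / (2.0 * s + 1.0))             # decaying step size
    weights = (1.0 + np.arange(beta.size)) ** (-2.0 * s) # damp high frequencies
    beta += lr * resid * weights * psi                   # O(J) time per update

# crude sanity check of the fit on a grid
grid = np.linspace(0.0, 1.0, 200)
Psi = np.sqrt(2.0) * np.cos(np.pi * np.outer(np.arange(beta.size), grid))
Psi[0, :] = 1.0
print(f"{beta.size} basis functions, grid MSE = {np.mean((Psi.T @ beta - f_true(grid))**2):.4f}")
```

Note that each update in this sketch touches only the current J_i coefficients, so both per-step time and total memory scale with the slowly growing sieve size, which is consistent with the abstract's claim of low time and space expense.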