Online learning arises naturally in many statistical and machine learning problems. The most widely used methods in online learning are stochastic first-order algorithms. Among this family is Recursive One-Over-T SGD (ROOT-SGD), a recently developed algorithm that converges at a fast non-asymptotic rate and whose estimator is asymptotically normal. However, the asymptotic covariance of this normal distribution is unknown, so it cannot be used directly for uncertainty quantification. To fill this gap, we develop two estimators of the asymptotic covariance of ROOT-SGD, both of which enable statistical inference with ROOT-SGD. Our first estimator is a plug-in estimator: each unknown component in the formula for the asymptotic covariance is replaced by its empirical counterpart. The plug-in estimator converges at the rate $\mathcal{O}(1/\sqrt{t})$, where $t$ is the sample size. Despite its fast convergence, the plug-in estimator has the limitation that it relies on the Hessian of the loss function, which may be unavailable in some settings. Our second estimator is Hessian-free and overcomes this limitation: it uses the random-scaling technique, and we show that it is an asymptotically consistent estimator of the true covariance.
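To make the plug-in idea concrete, the following is a minimal, hypothetical sketch assuming the asymptotic covariance takes the standard sandwich form $H^{-1} S H^{-1}$ (with $H$ the expected Hessian and $S$ the gradient covariance, as is typical for M-estimators; the paper's exact formula for ROOT-SGD may differ). The least-squares loss and all variable names here are illustrative, not taken from the paper.

```python
import numpy as np

# Illustrative plug-in sandwich covariance estimator for the least-squares
# loss l(x; a, b) = 0.5 * (a'x - b)^2. Assumption: asymptotic covariance
# has the sandwich form H^{-1} S H^{-1}; names are hypothetical.

rng = np.random.default_rng(0)
t, d = 2000, 3
A = rng.normal(size=(t, d))
x_star = np.array([1.0, -2.0, 0.5])
b = A @ x_star + rng.normal(size=t)

# Stand-in for the ROOT-SGD iterate: the least-squares solution.
x_hat = np.linalg.lstsq(A, b, rcond=None)[0]

# Plug-in components: replace each population quantity by its empirical mean.
H_hat = A.T @ A / t                    # empirical Hessian (average of a a')
grads = A * (A @ x_hat - b)[:, None]   # per-sample gradients at x_hat
S_hat = grads.T @ grads / t            # empirical gradient covariance

H_inv = np.linalg.inv(H_hat)
Sigma_hat = H_inv @ S_hat @ H_inv      # plug-in sandwich estimate

# Asymptotic 95% confidence intervals for each coordinate of x_star.
se = np.sqrt(np.diag(Sigma_hat) / t)
ci = np.stack([x_hat - 1.96 * se, x_hat + 1.96 * se])
```

This is precisely where the Hessian dependence shows up: `H_hat` requires second-order information, which motivates the Hessian-free random-scaling alternative.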