We study stopping rules for stochastic gradient descent (SGD) for convex optimization from the perspective of anytime-valid confidence sequences. Classical analyses of SGD provide convergence guarantees in expectation or at a fixed horizon, but offer no statistically valid way to assess, at an arbitrary time, how close the current iterate is to the optimum. We develop an anytime-valid, data-dependent upper confidence sequence for the weighted average suboptimality of projected SGD, constructed via nonnegative supermartingales and requiring neither smoothness nor strong convexity. This confidence sequence yields a simple stopping rule that is provably $\varepsilon$-optimal with probability at least $1-\alpha$, with explicit bounds on the stopping time under standard stochastic approximation stepsizes. To the best of our knowledge, these are the first rigorous, time-uniform performance guarantees and finite-time $\varepsilon$-optimality certificates for projected SGD with general convex objectives, based solely on observable trajectory quantities.
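To make the stopping-rule structure concrete, the following is a minimal Python sketch of projected SGD that halts once an observable upper confidence bound on the weighted average suboptimality drops below $\varepsilon$. It is illustrative only: the projection set (an $\ell_2$ ball of radius `D`), the stepsize $\eta_t = D/(G\sqrt{t})$, and in particular the time-uniform deviation term `psi` (a crude union-bound-over-time sub-Gaussian radius) are assumptions standing in for the paper's supermartingale-based confidence sequence, not the actual construction.

```python
import numpy as np

def project_ball(x, radius):
    """Euclidean projection onto the L2 ball of the given radius."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def psi(t, alpha, G, D, sum_eta2):
    """Hypothetical time-uniform deviation term (placeholder, NOT the paper's
    supermartingale boundary): a union-bound-over-time sub-Gaussian radius
    for the weighted noise martingale, using the accumulated squared stepsizes."""
    return 2.0 * G * D * np.sqrt(2.0 * sum_eta2 * np.log(2.0 * (t + 1) ** 2 / alpha))

def sgd_with_stopping(grad_oracle, x0, D, G, eps, alpha, max_iter=200_000):
    """Projected SGD that stops once an observable upper confidence bound on the
    weighted average suboptimality of the running weighted average falls below eps."""
    x = project_ball(np.asarray(x0, dtype=float), D)
    weighted_sum = np.zeros_like(x)   # accumulates eta_t * x_t
    sum_eta = 0.0                     # sum of stepsizes (weights)
    sum_eta2 = 0.0                    # sum of squared stepsizes
    sum_eta2_g2 = 0.0                 # sum of eta_t^2 * ||g_t||^2 (observable)
    for t in range(1, max_iter + 1):
        eta = D / (G * np.sqrt(t))    # standard stochastic-approximation stepsize
        g = grad_oracle(x)            # unbiased stochastic (sub)gradient at x_t
        weighted_sum += eta * x
        sum_eta += eta
        sum_eta2 += eta ** 2
        sum_eta2_g2 += eta ** 2 * float(g @ g)
        x = project_ball(x - eta * g, D)
        # Observable upper confidence bound on the weighted average suboptimality:
        # regret-style terms plus the (placeholder) deviation term, normalized by the weights.
        ucb = (0.5 * D ** 2 + 0.5 * sum_eta2_g2
               + psi(t, alpha, G, D, sum_eta2)) / sum_eta
        if ucb <= eps:
            return weighted_sum / sum_eta, t, ucb
    return weighted_sum / sum_eta, max_iter, ucb

# Illustrative usage: noisy gradients of f(x) = 0.5 * ||x - x_star||^2 on a ball of radius 5.
rng = np.random.default_rng(0)
x_star = np.array([1.0, -2.0])
oracle = lambda x: (x - x_star) + 0.1 * rng.standard_normal(2)
x_bar, stop_t, bound = sgd_with_stopping(oracle, x0=np.zeros(2), D=5.0, G=8.0,
                                         eps=5.0, alpha=0.05)
print(stop_t, float(bound), x_bar)
```

The design point is that every quantity entering `ucb` is computed from the observed trajectory (stepsizes, gradient norms, elapsed time), so the rule can be evaluated online at any iteration; the sharpness and validity of the certificate hinge entirely on the deviation term, which the paper supplies via its nonnegative-supermartingale construction rather than the placeholder above.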