Stochastic proximal point (SPP) methods have gained recent attention for stochastic optimization, offering strong convergence guarantees and superior robustness over classic stochastic gradient descent (SGD) methods at little to no added computational overhead. In this article, we study a minibatch variant of SPP, namely M-SPP, for solving convex composite risk minimization problems. The core contribution is a set of novel excess risk bounds for M-SPP derived through the lens of algorithmic stability theory. In particular, under smoothness and quadratic growth conditions, we show that M-SPP with minibatch size $n$ and iteration count $T$ enjoys an in-expectation fast rate of convergence consisting of an $\mathcal{O}\left(\frac{1}{T^2}\right)$ bias decaying term and an $\mathcal{O}\left(\frac{1}{nT}\right)$ variance decaying term. In the small-$n$-large-$T$ setting, this result substantially improves the best known results of SPP-type approaches by revealing the impact of the noise level of the model on the convergence rate. In the complementary small-$T$-large-$n$ regime, we provide a two-phase extension of M-SPP to achieve comparable convergence rates. Moreover, we derive a near-tight high-probability (over the randomness of data) bound on the parameter estimation error of a sampling-without-replacement variant of M-SPP. Numerical evidence is provided to support our theoretical predictions when instantiated for Lasso and logistic regression models.
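For concreteness, the generic minibatch proximal point update underlying this setup can be sketched as follows (a standard formulation; the step-size schedule $\{\gamma_t\}$ and the exact handling of the composite regularizer in the paper may differ). Given the current iterate $x_t$, a freshly sampled minibatch $\{z_{t,1},\dots,z_{t,n}\}$ of size $n$, and a step size $\gamma_t>0$,
$$
x_{t+1} = \operatorname*{arg\,min}_{x}\ \left\{ \frac{1}{n}\sum_{i=1}^{n}\ell(x; z_{t,i}) + r(x) + \frac{1}{2\gamma_t}\,\|x - x_t\|^2 \right\},
$$
where $\ell$ denotes the individual loss and $r$ the convex regularizer of the composite risk; taking $r$ to be the $\ell_1$-norm, for instance, yields a Lasso-type instantiation of the scheme.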