Stochastic gradient descent (SGD) has emerged as the quintessential method in a data scientist's toolbox. Much progress has been made in the last two decades toward understanding the iteration complexity of SGD (in expectation and with high probability) in the learning theory and optimization literature. However, using SGD for high-stakes applications requires careful quantification of the associated uncertainty. Toward that end, in this work, we establish high-dimensional Central Limit Theorems (CLTs) for linear functionals of online least-squares SGD iterates under a Gaussian design assumption. Our main result shows that a CLT holds even when the dimensionality is of order exponential in the number of iterations of the online SGD, thereby enabling high-dimensional inference with online SGD. Our proof technique involves leveraging Berry-Esseen bounds developed for martingale difference sequences and carefully evaluating the required moment and quadratic variation terms through recent advances in concentration inequalities for product random matrices. We also provide an online approach for estimating the variance appearing in the CLT (required for constructing confidence intervals in practice) and establish consistency results in the high-dimensional setting.
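The object of study above, the online least-squares SGD iterate and a linear functional of it, can be sketched on synthetic data. This is a minimal illustration under assumed choices (dimension, constant step size, standard Gaussian design and noise), not the paper's exact algorithm or inference procedure:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic streaming least-squares setup (hypothetical parameters):
# Gaussian design x_t ~ N(0, I_d), responses y_t = <x_t, theta*> + noise.
d, n = 20, 5000
theta_star = rng.normal(size=d) / np.sqrt(d)   # ground-truth parameter
v = np.zeros(d)
v[0] = 1.0                                     # linear functional of interest: <v, theta>

eta = 0.01                                     # constant step size (illustrative choice)
theta = np.zeros(d)

# Online SGD: one fresh sample per iteration, a single pass over the stream.
for _ in range(n):
    x = rng.normal(size=d)
    y = x @ theta_star + rng.normal()
    theta -= eta * (x @ theta - y) * x         # gradient step for 0.5*(x'theta - y)^2

# The CLT in the paper concerns the fluctuation of this scalar around <v, theta*>;
# constructing a confidence interval additionally requires the (online) variance estimate.
functional_estimate = v @ theta
print(functional_estimate)
```

The paper's contribution is precisely characterizing the Gaussian limit of `functional_estimate` when `d` grows exponentially in `n`, together with a consistent online estimator of the limiting variance; the variance-estimation step is omitted here.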