We give a description of the high-dimensional limit of one-pass, single-batch stochastic gradient descent (SGD) on a least squares problem. The limit is taken with a non-vanishing step size and with the number of samples proportional to the problem dimension. It is described by a high-dimensional stochastic differential equation, which is shown to approximate the state evolution of SGD. As a corollary, the statistical risk is shown to be approximated by the solution of a convolution-type Volterra equation, with error vanishing as the dimension tends to infinity. The mode of convergence is the weakest that ensures the statistical risks of the two processes coincide. This analysis is distinguished from existing ones by the type of high-dimensional limit taken and by the generality of the covariance structure of the samples.
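To fix notation for the objects mentioned above, the following is a minimal sketch; the symbols $x_k$, $a_k$, $b_k$, $\gamma$, $\Psi$, $F$, and $K$ are illustrative labels rather than quantities taken from a specific theorem statement. One-pass, single-batch SGD on least squares uses each sample $(a_k, b_k)$ exactly once,
$$
x_{k+1} \;=\; x_k \;-\; \gamma\,\bigl(a_k^{\top} x_k - b_k\bigr)\,a_k ,
$$
and a convolution-type Volterra equation for the risk $\Psi$ has the generic form
$$
\Psi(t) \;=\; F(t) \;+\; \int_0^{t} K(t-s)\,\Psi(s)\,\mathrm{d}s ,
$$
where the forcing term $F$ and the kernel $K$ would be determined by the step size and the covariance structure of the samples.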