Stochastic gradient descent (SGD) is a widely used estimation tool for large datasets in machine learning and statistics. Due to the Markovian nature of the SGD process, inference is a challenging problem. The asymptotic normality of the averaged SGD (ASGD) estimator allows for the construction of a batch-means estimator of the asymptotic covariance matrix. Instead of the increasing batch-size strategy usually employed for ASGD, we propose a memory-efficient equal batch-size strategy and show that, under mild conditions, the resulting estimator is consistent. A key feature of the proposed batching technique is that it allows for bias correction of the variance at no additional memory cost. Since joint inference for high-dimensional problems may be undesirable, we present marginal-friendly simultaneous confidence intervals, and we show through an example how covariance estimators for ASGD can be employed to improve predictions.
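The following is a minimal illustrative sketch, not the paper's implementation, of the general idea: run averaged SGD and estimate the asymptotic covariance from equal-size batch means of the iterates. It assumes a least-squares loss, a slowly decaying step size, and arbitrary illustrative choices for the number of batches; the helper name `run_asgd` and all tuning constants are hypothetical.

```python
# Illustrative sketch only: ASGD on a least-squares loss, followed by an
# equal batch-size batch-means estimate of the asymptotic covariance.
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: y = X beta_true + noise
n, d = 100_000, 5
beta_true = np.ones(d)
X = rng.standard_normal((n, d))
y = X @ beta_true + rng.standard_normal(n)

def run_asgd(X, y, eta0=0.5, alpha=0.51):
    """One pass of SGD on the least-squares loss, returning all iterates."""
    d = X.shape[1]
    theta = np.zeros(d)
    iterates = np.empty((len(y), d))
    for i in range(len(y)):
        eta = eta0 * (i + 1) ** (-alpha)      # slowly decaying step size (assumed schedule)
        grad = (X[i] @ theta - y[i]) * X[i]   # stochastic gradient of the squared loss
        theta -= eta * grad
        iterates[i] = theta
    return iterates

iterates = run_asgd(X, y)
theta_bar = iterates.mean(axis=0)             # ASGD estimator

# Equal batch-size batch means: split the n iterates into M batches of size b,
# take each batch's mean, and scale the spread of the batch means by b.
M = 100                                       # illustrative choice of number of batches
b = n // M                                    # equal batch size
batch_means = iterates[:M * b].reshape(M, b, d).mean(axis=1)
centered = batch_means - batch_means.mean(axis=0)
Sigma_hat = b * (centered.T @ centered) / (M - 1)

# Approximate 95% marginal confidence intervals for each coordinate of theta_bar
half_width = 1.96 * np.sqrt(np.diag(Sigma_hat) / n)
print(np.column_stack([theta_bar - half_width, theta_bar + half_width]))
```

The sketch keeps every SGD iterate only for clarity; the equal batch-size strategy described above can instead maintain running batch sums, which is the source of the memory savings discussed in the abstract.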