Stochastic gradient descent (SGD) is a scalable and memory-efficient optimization algorithm for large datasets and streaming data, and has attracted considerable attention. Applications of SGD-based estimators to statistical inference, such as interval estimation, have also achieved great success. However, most existing work assumes i.i.d. observations or Markov chains; when the observations come from a mixing time series, how to conduct valid statistical inference remains unexplored. Indeed, the general correlation among observations poses a challenge for interval estimation: most existing methods ignore this correlation and may yield invalid confidence intervals. In this paper, we propose a mini-batch SGD estimator for statistical inference when the data are $\phi$-mixing. The confidence intervals are constructed using an associated mini-batch bootstrap SGD procedure. Using the ``independent block'' technique of \cite{yu1994rates}, we show that the proposed estimator is asymptotically normal, and that its limiting distribution can be effectively approximated by the bootstrap procedure. The proposed method is memory-efficient and easy to implement in practice. Simulation studies on synthetic data and an application to a real-world dataset support our theory.
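To make the procedure concrete, the following is a minimal sketch of a mini-batch SGD estimator run alongside randomly perturbed bootstrap trajectories. It assumes a linear model, a polynomially decaying step size, and a mean-one multiplier-weight perturbation; all function names, the weight distribution, and these modeling choices are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np

def minibatch_sgd_with_bootstrap(stream, dim, batch_size=10, n_boot=50,
                                 lr0=0.5, seed=0):
    """Sketch: mini-batch SGD for least squares, plus n_boot bootstrap
    trajectories whose updates are scaled by i.i.d. mean-one random
    weights (a multiplier-bootstrap-style perturbation; an assumption here).

    `stream` yields (X, y) mini-batches with X of shape (batch_size, dim).
    Returns the Polyak--Ruppert averaged iterate and the averaged
    bootstrap iterates, from which interval estimates can be formed.
    """
    rng = np.random.default_rng(seed)
    theta = np.zeros(dim)               # main SGD iterate
    boot = np.zeros((n_boot, dim))      # bootstrap iterates
    theta_bar = np.zeros(dim)           # running average of theta
    boot_bar = np.zeros((n_boot, dim))  # running averages of bootstrap iterates
    for t, (X, y) in enumerate(stream, start=1):
        lr = lr0 / t ** 0.51            # step size gamma_t = c * t^{-a}, a in (1/2, 1)
        # mini-batch least-squares gradient at the current iterate
        grad = X.T @ (X @ theta - y) / batch_size
        theta -= lr * grad
        # bootstrap updates: same gradients, each scaled by a random weight
        w = rng.exponential(1.0, size=(n_boot, 1))      # mean-one multipliers
        boot_grad = (X @ boot.T - y[:, None]).T @ X / batch_size
        boot -= lr * w * boot_grad
        # online Polyak--Ruppert averaging
        theta_bar += (theta - theta_bar) / t
        boot_bar += (boot - boot_bar) / t
    return theta_bar, boot_bar
```

A confidence interval for a coordinate of the parameter can then be read off from empirical quantiles of `boot_bar[:, j] - theta_bar[j]`, mimicking how the bootstrap trajectories approximate the sampling distribution of the averaged estimator.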