On the equivalence of different adaptive batch size selection strategies for stochastic gradient descent methods

In this study, we demonstrate that the norm test and the inner product/orthogonality test presented in \cite{Bol18} are equivalent in terms of the convergence rates associated with Stochastic Gradient Descent (SGD) methods if $\epsilon^2=\theta^2+\nu^2$ with specific choices of $\theta$ and $\nu$. Here, $\epsilon$ controls the relative statistical error of the norm of the gradient, while $\theta$ and $\nu$ control the relative statistical error of the gradient in the direction of the gradient and in the direction orthogonal to the gradient, respectively. Furthermore, we demonstrate that the inner product/orthogonality test can be as inexpensive as the norm test in the best-case scenario, when $\theta$ and $\nu$ are optimally selected, but that it is never computationally cheaper than the norm test if $\epsilon^2=\theta^2+\nu^2$. Finally, we present two stochastic optimization problems to illustrate our results.
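
The cost comparison can be illustrated with a small numerical sketch. It assumes the tests take the usual variance-bound form: the norm test bounds the total per-sample gradient variance by $\epsilon^2\|\nabla F\|^2$ times the batch size, while the inner product and orthogonality tests bound the variance along and orthogonal to the gradient by $\theta^2$ and $\nu^2$ times the same quantity. The function names, the use of the sample mean as a stand-in for the true gradient, and the "balanced" choice of $\theta$ and $\nu$ are assumptions made for illustration, not the construction used in the paper.

```python
import math
import numpy as np


def variance_split(grads):
    """Split per-sample gradient variance into parallel and orthogonal parts.

    grads: array of shape (n, d), one per-sample gradient per row.
    The sample mean is used as a stand-in for the true gradient (assumption).
    """
    g = grads.mean(axis=0)
    g_sq = float(g @ g)                      # squared norm of the mean gradient
    dev = grads - g                          # deviation of each sample from the mean
    par = dev @ g / math.sqrt(g_sq)          # signed component along the gradient
    var_par = float(np.mean(par ** 2))
    var_tot = float(np.mean(np.sum(dev ** 2, axis=1)))
    var_orth = max(var_tot - var_par, 0.0)   # Pythagoras: the rest is orthogonal
    return g_sq, var_par, var_orth


def min_batch_sizes(grads, eps, theta, nu):
    """Real-valued minimum batch sizes (round up in practice).

    Returns (b_norm, b_ip_orth): the batch size required by the norm test and
    the one required by the inner product and orthogonality tests together.
    """
    g_sq, var_par, var_orth = variance_split(grads)
    b_norm = (var_par + var_orth) / (eps ** 2 * g_sq)
    b_ip = var_par / (theta ** 2 * g_sq)
    b_orth = var_orth / (nu ** 2 * g_sq)
    return b_norm, max(b_ip, b_orth)


def balanced_theta_nu(grads, eps):
    """Hypothetical choice with theta^2 + nu^2 = eps^2 that makes the inner
    product and orthogonality constraints equally tight."""
    _, var_par, var_orth = variance_split(grads)
    var_tot = var_par + var_orth
    return eps * math.sqrt(var_par / var_tot), eps * math.sqrt(var_orth / var_tot)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grads = rng.normal(loc=1.0, scale=2.0, size=(1000, 5))  # synthetic gradients
    eps = 0.5
    # An even split of eps^2 between theta^2 and nu^2.
    print(min_batch_sizes(grads, eps, eps / math.sqrt(2), eps / math.sqrt(2)))
    # The balanced split, where the two requirements coincide.
    print(min_batch_sizes(grads, eps, *balanced_theta_nu(grads, eps)))
```

In this sketch the second returned value is never smaller than the first whenever $\theta^2+\nu^2=\epsilon^2$, because $\max(a/\theta^2, b/\nu^2) \ge (a+b)/(\theta^2+\nu^2)$ for nonnegative $a$ and $b$; the two agree only when $\theta^2$ and $\nu^2$ are split in proportion to the parallel and orthogonal variances.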
