To explore the limits of a stochastic gradient method, it can be useful to consider an example consisting of infinitely many quadratic functions. In this setting, one needs the expected value and the covariance matrix of the stochastic noise, i.e. the difference between the true gradient and the approximate gradient computed from a finite sample. Specifying the covariance matrix requires the expected value of the quadratic form QBQ, where Q is a Wishart-distributed random matrix and B is an arbitrary fixed symmetric matrix. An expression for E(QBQ) is derived and some special cases are considered; a numerical example then shows how these results can support the comparison of two stochastic methods.
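
As a rough illustration of the quantity E(QBQ), the sketch below compares a Monte Carlo estimate with a closed form. It assumes Q follows a Wishart distribution with integer degrees of freedom n and scale matrix Sigma, i.e. Q is a sum of n outer products of independent N(0, Sigma) vectors. The closed form used, E(QBQ) = n(n+1) Sigma B Sigma + n tr(B Sigma) Sigma, follows from the standard Gaussian fourth-moment identity and is not necessarily written the way the paper states its result. The function names and the test matrices are hypothetical and chosen only for this example.

```python
import numpy as np


def expected_qbq(n, sigma, b):
    """Closed form of E[Q B Q] for Q ~ Wishart(n, sigma) and symmetric b.

    Assumption: this is the standard fourth-moment identity, which may
    differ in form from the expression derived in the paper.
    """
    sbs = sigma @ b @ sigma
    return n * (n + 1) * sbs + n * np.trace(b @ sigma) * sigma


def monte_carlo_qbq(n, sigma, b, draws, rng):
    """Estimate E[Q B Q] by averaging over simulated Wishart matrices."""
    p = sigma.shape[0]
    chol = np.linalg.cholesky(sigma)
    # x[d, k, :] is the k-th N(0, sigma) vector of the d-th draw.
    x = rng.standard_normal((draws, n, p)) @ chol.T
    # Q_d = sum_k x_k x_k^T, one p-by-p matrix per draw.
    q = np.einsum("dki,dkj->dij", x, x)
    return np.mean(q @ b @ q, axis=0)


# Hypothetical example data: a positive definite sigma and a symmetric b.
sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
b = np.array([[1.0, 0.2, 0.0],
              [0.2, -0.5, 0.1],
              [0.0, 0.1, 2.0]])

exact = expected_qbq(5, sigma, b)
estimate = monte_carlo_qbq(5, sigma, b, 200_000, np.random.default_rng(0))
rel_err = np.linalg.norm(estimate - exact) / np.linalg.norm(exact)
print(f"relative Frobenius error: {rel_err:.4f}")
```

The simulation builds Q directly from Gaussian vectors rather than calling a library Wishart sampler, so the link between the sample size n and the noise statistics stays visible. The printed error should shrink as the number of draws grows.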