Computing the gradient of a function provides fundamental information about its behavior. This information is essential for many applications and algorithms across various fields. Common applications that require gradients are optimization techniques such as stochastic gradient descent, Newton's method, and trust region methods. However, these methods usually require a numerical computation of the gradient at every iteration, which is prone to numerical errors. We propose a simple limited-memory technique for improving the accuracy of a numerically computed gradient in this gradient-based optimization framework by exploiting (1) a coordinate transformation of the gradient and (2) the history of previously taken descent directions. The method is verified empirically through extensive experimentation on both test functions and real data applications. The proposed method is implemented in the R package smartGrad and in C++.
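As a rough illustration of the idea (not the smartGrad package API; the function name and arguments below are hypothetical), one can orthonormalize the most recent descent directions, take central finite differences along the resulting basis, and map the directional derivatives back to canonical coordinates:

```r
## Minimal sketch, assuming the history of descent directions is supplied as
## columns of a matrix. This is an illustrative reimplementation of the idea,
## not the interface of the smartGrad package.
smart_gradient_sketch <- function(f, x, directions, h = 1e-5) {
  n <- length(x)
  ## Orthonormalize the descent-direction history, padded with the canonical
  ## axes so that the basis spans the whole space.
  B <- cbind(directions, diag(n))
  M <- qr.Q(qr(B))[, seq_len(n), drop = FALSE]
  ## Central finite differences along each basis vector.
  d <- vapply(seq_len(n), function(i) {
    (f(x + h * M[, i]) - f(x - h * M[, i])) / (2 * h)
  }, numeric(1))
  ## Since M is orthonormal, the gradient in the original coordinates is M d.
  as.vector(M %*% d)
}

## Usage example: gradient of the Rosenbrock function, given two
## (hypothetical) previously taken descent directions.
rosen <- function(z) (1 - z[1])^2 + 100 * (z[2] - z[1]^2)^2
dirs  <- cbind(c(1, 1) / sqrt(2), c(1, 0))
smart_gradient_sketch(rosen, c(-1.2, 1), dirs)
```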