Using backpropagation to compute gradients of objective functions for optimization has remained a mainstay of machine learning. Backpropagation, or reverse-mode differentiation, is a special case within the general family of automatic differentiation algorithms that also includes the forward mode. We present a method to compute gradients based solely on the directional derivative that one can compute exactly and efficiently via the forward mode. We call this formulation the forward gradient, an unbiased estimate of the gradient that can be evaluated in a single forward run of the function, entirely eliminating the need for backpropagation in gradient descent. We demonstrate forward gradient descent in a range of problems, showing substantial savings in computation and enabling training up to twice as fast in some cases.
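The abstract does not spell out how the directional derivative yields an unbiased gradient estimate. A minimal sketch follows, assuming the forward gradient takes the form g = (∇f(θ)·v) v with a random direction v ~ N(0, I), which is unbiased because E[v vᵀ] = I; the helper name `forward_gradient` and the toy quadratic objective are illustrative, not from the source. JAX's `jax.jvp` computes the required Jacobian-vector product in a single forward-mode pass, with no reverse pass:

```python
import jax
import jax.numpy as jnp

def forward_gradient(f, theta, key):
    # Sample a random direction v ~ N(0, I) with the same shape as theta.
    v = jax.random.normal(key, theta.shape)
    # One forward-mode evaluation: returns f(theta) together with the
    # directional derivative (Jacobian-vector product) ∇f(theta) · v.
    loss, dir_deriv = jax.jvp(f, (theta,), (v,))
    # Forward gradient: scale the sampled direction by the directional
    # derivative. E[(∇f · v) v] = ∇f, so the estimate is unbiased.
    return loss, dir_deriv * v

# Usage: forward gradient descent on a toy quadratic objective
# (hypothetical example, not from the paper).
f = lambda x: jnp.sum(x ** 2)
theta = jnp.array([1.0, -2.0, 3.0])
key = jax.random.PRNGKey(0)
lr = 0.1
for step in range(100):
    key, subkey = jax.random.split(key)
    loss, g = forward_gradient(f, theta, subkey)
    theta = theta - lr * g
```

Each update here costs a single forward run of `f`, which is the source of the claimed computational savings over backpropagation-based gradient descent.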