Several machine learning applications involve the optimization of higher-order derivatives (e.g., gradients of gradients) during training, which can be expensive with respect to memory and computation even with automatic differentiation. As a typical example in generative modeling, score matching (SM) involves the optimization of the trace of a Hessian. To improve computational efficiency, we rewrite the SM objective and its variants in terms of directional derivatives, and present a generic strategy to efficiently approximate any-order directional derivative with finite differences (FD). Our approximation involves only function evaluations, which can be executed in parallel, and no gradient computations. It therefore reduces the total computational cost while also improving numerical stability. We provide two instantiations by reformulating variants of SM objectives into the FD forms. Empirically, we demonstrate that our methods produce results comparable to those of the gradient-based counterparts while being much more computationally efficient.
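To make the FD idea concrete, here is a minimal NumPy sketch (not the paper's implementation) of central finite-difference estimates for the first- and second-order directional derivatives of a scalar function. It uses only three function evaluations, which are independent of one another and could be batched or run in parallel, and no gradient computations; the helper name `fd_directional_derivatives` and the step size `eps` are illustrative assumptions.

```python
import numpy as np

def fd_directional_derivatives(f, x, v, eps=1e-3):
    """Central finite-difference estimates of the first- and second-order
    directional derivatives of a scalar function f at x along direction v.

    Only function evaluations are used (no gradients), and the three
    evaluations are independent, so they can be executed in parallel.
    """
    f_plus = f(x + eps * v)
    f_minus = f(x - eps * v)
    f_zero = f(x)
    # v^T grad f(x)  ~  [f(x + eps v) - f(x - eps v)] / (2 eps)
    first = (f_plus - f_minus) / (2.0 * eps)
    # v^T Hess f(x) v  ~  [f(x + eps v) - 2 f(x) + f(x - eps v)] / eps^2
    second = (f_plus - 2.0 * f_zero + f_minus) / eps**2
    return first, second

# Toy check on f(x) = ||x||^2 / 2, whose gradient is x and Hessian is
# the identity, so the exact values are v^T x and ||v||^2.
f = lambda x: 0.5 * np.dot(x, x)
x = np.random.randn(5)
v = np.random.randn(5)
d1, d2 = fd_directional_derivatives(f, x, v)
print(d1, np.dot(v, x))  # first-order: FD estimate vs. exact
print(d2, np.dot(v, v))  # second-order: FD estimate vs. exact
```

In an SM-style objective, the second-order term corresponds to the Hessian-trace contribution along a random projection direction v, which is how the FD forms avoid explicit computation of gradients of gradients.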