We present a new algorithm for automatically bounding the Taylor remainder series. In the special case of a scalar function $f: \mathbb{R} \mapsto \mathbb{R}$, our algorithm takes as input a reference point $x_0$, trust region $[a, b]$, and integer $k \ge 0$, and returns an interval $I$ such that $f(x) - \sum_{i=0}^k \frac {f^{(i)}(x_0)} {i!} (x - x_0)^i \in I (x - x_0)^{k+1}$ for all $x \in [a, b]$. As in automatic differentiation, the function $f$ is provided to the algorithm in symbolic form, and must be composed of known elementary functions. At a high level, our algorithm has two steps. First, for a variety of commonly-used elementary functions (e.g., $\exp$, $\log$), we derive sharp polynomial upper and lower bounds on the Taylor remainder series. We then recursively combine the bounds for the elementary functions using an interval arithmetic variant of Taylor-mode automatic differentiation. Our algorithm can make efficient use of machine learning hardware accelerators, and we provide an open source implementation in JAX. We then turn our attention to applications. Most notably, we use our new machinery to create the first universal majorization-minimization optimization algorithms: algorithms that iteratively minimize an arbitrary loss using a majorizer that is derived automatically, rather than by hand. Applied to machine learning, this leads to architecture-specific optimizers for training deep networks that converge from any starting point, without hyperparameter tuning. Our experiments show that for some optimization problems, these hyperparameter-free optimizers outperform tuned versions of gradient descent, Adam, and AdaGrad. We also show that our automatically-derived bounds can be used for verified global optimization and numerical integration, and to prove sharper versions of Jensen's inequality.
翻译:我们为自动绑定泰勒剩余值序列提供了一个新的算法 。 在 calar 函数的特例 $f :\ mathbb{R}\ mappsto\ mathb{R}$, 我们的算法将所有 $x_0 $, 信任区域 $a, b$, 整数 $k\ge 0 美元, 并返回一个间隔 $I, 这样, $( x) -\ sum_ i=0\ flac {f} (x_0)}} (i) ) (x - x_0) ) like lider 函数 : (xx - x_0) k+1} mapsto falbtoodeal) 。 在自动分解中, 美元函数以象征形式提供给算算算算器, 并且由已知的基本函数 。 在高水平上, 我们的算算算法有两个步骤 。 首先, 一种常用的直位化函数, 我们用直径直径解的直径解 。