Since the late 1950s, when quasi-Newton methods first appeared, they have become one of the most widely used and efficient algorithmic paradigms for unconstrained optimization. Despite their immense practical success, there is little theory that explains why these methods are so efficient. We provide a semi-local rate of convergence for the randomized BFGS method which can be significantly better than that of gradient descent, finally giving theoretical evidence supporting the superior empirical performance of the method.
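For context, the abstract contrasts BFGS-type methods with gradient descent. The following is a minimal sketch of the *classical* (deterministic) BFGS inverse-Hessian update applied to a toy strongly convex quadratic; it is an illustration of the general quasi-Newton paradigm only, not the randomized BFGS method analyzed in the paper, and the test problem, function names, and step-size choice are our own assumptions.

```python
import numpy as np

def bfgs(f_grad, x0, n_iters=50):
    """Classical BFGS with unit steps (illustration only; not the
    randomized variant studied in the paper)."""
    n = len(x0)
    x = np.asarray(x0, dtype=float).copy()
    H = np.eye(n)                 # initial inverse-Hessian approximation
    I = np.eye(n)
    g = f_grad(x)
    for _ in range(n_iters):
        if np.linalg.norm(g) < 1e-10:
            break                 # converged
        d = -H @ g                # quasi-Newton search direction
        x_new = x + d             # unit step (adequate for this toy quadratic)
        g_new = f_grad(x_new)
        s, y = x_new - x, g_new - g
        sy = y @ s
        if sy <= 1e-16:           # skip update if curvature condition fails
            x, g = x_new, g_new
            continue
        rho = 1.0 / sy
        # BFGS update: H <- (I - rho s y^T) H (I - rho y s^T) + rho s s^T
        H = (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) \
            + rho * np.outer(s, s)
        x, g = x_new, g_new
    return x

# Toy quadratic f(x) = 0.5 x^T A x - b^T x (our own test problem):
# the update builds curvature information, so H approaches A^{-1}
# and the iterates behave like Newton steps.
A = np.diag(np.linspace(0.1, 1.0, 5))
b = np.ones(5)
f_grad = lambda x: A @ x - b
x_star = bfgs(f_grad, np.zeros(5))
```

The key point the sketch conveys is that BFGS never forms or inverts the Hessian; it maintains a running inverse-Hessian approximation built only from gradient differences, which is why such methods can outpace plain gradient descent.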