Quasi-Newton Methods for Machine Learning: Forget the Past, Just Sample

We present two sampled quasi-Newton methods (sampled LBFGS and sampled LSR1) for solving empirical risk minimization problems that arise in machine learning. Contrary to the classical variants of these methods that sequentially build Hessian or inverse Hessian approximations as the optimization progresses, our proposed methods sample points randomly around the current iterate at every iteration to produce these approximations. As a result, the approximations constructed make use of more reliable (recent and local) information, and do not depend on past iterate information that could be significantly stale. Our proposed algorithms are efficient in terms of accessed data points (epochs) and have enough concurrency to take advantage of parallel/distributed computing environments. We provide convergence guarantees for our proposed methods. Numerical tests on a toy classification problem as well as on popular benchmarking binary classification and neural network training tasks reveal that the methods outperform their classical variants.
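The sampling idea in the abstract can be illustrated with a minimal sketch for the LBFGS variant. It is not the authors' implementation, and several pieces are assumptions made here for illustration: the names `sample_curvature_pairs`, `two_loop_direction`, and `sampled_lbfgs`; the parameters `m` and `radius`; forming each curvature pair from a gradient difference; the Armijo backtracking line search; and the toy quadratic objective. At every iteration the sketch discards all previous pairs, draws fresh points around the current iterate, and feeds the resulting pairs to the standard L-BFGS two-loop recursion.

```python
import numpy as np

def sample_curvature_pairs(grad, w, m, radius, rng):
    """Draw m random points around w and build (s, y) curvature pairs.

    Assumption: y is a gradient difference, y = grad(w + s) - grad(w).
    """
    g_w = grad(w)
    pairs = []
    for _ in range(m):
        s = radius * rng.standard_normal(w.shape)  # random displacement from w
        y = grad(w + s) - g_w                      # local curvature information
        # Keep only pairs with positive curvature so the approximation stays PD.
        if s @ y > 1e-8 * np.linalg.norm(s) * np.linalg.norm(y):
            pairs.append((s, y))
    return pairs

def two_loop_direction(g, pairs):
    """Standard L-BFGS two-loop recursion; returns the direction -H g."""
    q = g.copy()
    alphas = []
    for s, y in reversed(pairs):
        a = (s @ q) / (y @ s)
        alphas.append(a)
        q -= a * y
    if pairs:
        s, y = pairs[-1]
        q *= (s @ y) / (y @ y)                     # usual initial scaling
    for (s, y), a in zip(pairs, reversed(alphas)):
        b = (y @ q) / (y @ s)
        q += (a - b) * s
    return -q

def sampled_lbfgs(f, grad, w0, iters=50, m=5, radius=0.1, seed=0):
    """Hypothetical driver: fresh pairs every iteration, no stored history."""
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float).copy()
    for _ in range(iters):
        g = grad(w)
        if np.linalg.norm(g) < 1e-10:
            break
        pairs = sample_curvature_pairs(grad, w, m, radius, rng)
        d = two_loop_direction(g, pairs)
        # Armijo backtracking (an assumption of this sketch).
        t = 1.0
        while t > 1e-12 and f(w + t * d) > f(w) + 1e-4 * t * (g @ d):
            t *= 0.5
        if t > 1e-12:
            w = w + t * d
    return w

# Toy strongly convex quadratic, used only to exercise the sketch.
A = np.diag(np.arange(1.0, 11.0))
b = np.ones(10)
f = lambda w: 0.5 * w @ A @ w - b @ w
grad = lambda w: A @ w - b

w_start = np.zeros(10)
w_final = sampled_lbfgs(f, grad, w_start)
print("f(start) =", f(w_start), " f(final) =", f(w_final))
```

Because the `m` gradient evaluations inside `sample_curvature_pairs` do not depend on one another, they could in principle run in parallel, which is the kind of concurrency the abstract refers to.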

Related content

Quasi-Newton methods

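Quasi-Newton methods are among the most effective methods for solving nonlinear optimization problems. They were proposed in the 1950s by W. C. Davidon, a physicist at Argonne National Laboratory in the United States. At the time, Davidon's algorithm was regarded as one of the most creative inventions in nonlinear optimization. Soon afterwards, R. Fletcher and M. J. D. Powell demonstrated that the new algorithm was much faster and more reliable than other methods, and the field of nonlinear optimization advanced dramatically almost overnight.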
