This paper presents AGGLIO (Accelerated Graduated Generalized LInear-model Optimization), a stage-wise, graduated optimization technique that offers global convergence guarantees for non-convex optimization problems whose objectives offer only local convexity and may fail to be even quasi-convex at a global scale. In particular, this includes learning problems that use popular activation functions such as sigmoid, softplus, and SiLU, which yield non-convex training objectives. AGGLIO can be readily implemented using point as well as mini-batch SGD updates and offers provable convergence to the global optimum under general conditions. In experiments, AGGLIO outperformed several recently proposed optimization techniques for non-convex and locally convex objectives in terms of both convergence rate and convergent accuracy. AGGLIO relies on a graduation technique for generalized linear models, as well as a novel proof strategy, both of which may be of independent interest.
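To make the stage-wise graduation idea concrete, the sketch below shows a generic graduated-optimization loop for a sigmoid generalized linear model: a temperature parameter flattens the activation so early stages are better behaved, and each stage warm-starts SGD from the previous solution as the temperature is annealed toward 1. This is a minimal illustration of the general technique, not the paper's algorithm; the function names, the linear graduation schedule, and the squared-loss objective are all assumptions made for the example.

```python
# Illustrative sketch of stage-wise graduated optimization for a sigmoid GLM.
# NOT the AGGLIO algorithm itself: schedule, loss, and step sizes are assumed.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def graduated_sgd(X, y, n_stages=10, steps_per_stage=200, lr=0.1, rng=None):
    """Run SGD over a sequence of objectives graduated by a temperature c.

    Small c flattens sigmoid(c * x.w), giving a locally easier problem;
    c -> 1 recovers the original non-convex objective. Each stage is
    warm-started from the previous stage's iterate.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    w = np.zeros(d)
    for c in np.linspace(0.1, 1.0, n_stages):  # graduation schedule (illustrative)
        for _ in range(steps_per_stage):
            i = rng.integers(n)                 # point (single-sample) SGD update
            pred = sigmoid(c * (X[i] @ w))
            # Gradient of 0.5 * (sigmoid(c * x.w) - y)^2 with respect to w
            grad = (pred - y[i]) * pred * (1.0 - pred) * c * X[i]
            w -= lr * grad
    return w
```

The same loop extends to mini-batch updates by sampling an index set instead of a single point and averaging the per-sample gradients; the warm-start across stages is what distinguishes graduated optimization from simply running SGD on the final objective.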