Fast Objective and Duality Gap Convergence for Non-Convex Strongly-Concave Min-Max Problems with PL Condition

This paper focuses on stochastic methods for solving smooth non-convex strongly-concave min-max problems, which have received increasing attention due to their potential applications in deep learning (e.g., deep AUC maximization, distributionally robust optimization). However, most of the existing algorithms are slow in practice, and their analysis revolves around convergence to a nearly stationary point. We consider leveraging the Polyak-Lojasiewicz (PL) condition to design faster stochastic algorithms with stronger convergence guarantees. Although the PL condition has been utilized for designing many stochastic minimization algorithms, its application to non-convex min-max optimization remains rare. In this paper, we propose and analyze a generic framework of proximal stage-based methods into which many well-known stochastic updates can be embedded. Fast convergence is established in terms of both the primal objective gap and the duality gap. Compared with existing studies, (i) our analysis is based on a novel Lyapunov function consisting of the primal objective gap and the duality gap of a regularized function, and (ii) our results are more comprehensive, with improved rates that have better dependence on the condition number under different assumptions. We also conduct experiments on deep and non-deep learning tasks to verify the effectiveness of our methods.
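For reference, the quantities named in the abstract can be made precise with the following standard definitions. This is a minimal sketch in generic notation: the symbols f, P, mu, and the set Y are illustrative, not the paper's exact statement.

\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}

% Smooth non-convex strongly-concave min-max problem: f(x, .) is strongly
% concave for every fixed x, while f(., y) may be non-convex. P denotes
% the primal function.
\[
  \min_{x \in \mathbb{R}^d} \max_{y \in \mathcal{Y}} f(x, y),
  \qquad P(x) := \max_{y \in \mathcal{Y}} f(x, y).
\]

% Polyak-Lojasiewicz (PL) condition on P with modulus \mu > 0: the squared
% gradient norm dominates the primal objective gap at every point.
\[
  \frac{1}{2}\,\lVert \nabla P(x) \rVert^2
  \;\ge\; \mu \bigl( P(x) - \min_{x'} P(x') \bigr)
  \quad \text{for all } x.
\]

% The two convergence measures from the abstract: the primal objective gap
% and the duality gap at a primal-dual pair (x, y).
\[
  \Delta_P(x) := P(x) - \min_{x'} P(x'),
  \qquad
  \operatorname{Gap}(x, y) := \max_{y'} f(x, y') - \min_{x'} f(x', y).
\]

\end{document}

By weak duality, \(\min_{x'} f(x', y) \le \min_{x'} P(x')\), hence \(\operatorname{Gap}(x, y) \ge \Delta_P(x)\); convergence in the duality gap is therefore the stronger of the two guarantees.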