Minimax optimization over Riemannian manifolds (possibly nonconvex constraint sets) has been applied to many problems, such as robust dimensionality reduction and deep neural networks with orthogonal weights (the Stiefel manifold). Although many minimax algorithms have been developed in the Euclidean setting, they are difficult to carry over to the Riemannian case, and algorithms for nonconvex minimax problems with nonconvex constraints are rarer still. Meanwhile, to address big-data challenges, decentralized (serverless) training techniques have recently emerged, since they reduce communication overhead and avoid the bottleneck at the server node. Nonetheless, algorithms for decentralized Riemannian minimax problems have not been studied. In this paper, we study the distributed nonconvex-strongly-concave minimax optimization problem over the Stiefel manifold and propose both deterministic and stochastic minimax methods. Each local objective is nonconvex-strongly-concave, the Stiefel manifold is a nonconvex set, and the global function is the finite sum of the local functions. For the deterministic setting, we propose DRGDA and prove that it achieves a gradient complexity of $O(\epsilon^{-2})$ under mild conditions. For the stochastic setting, we propose DRSGDA and prove that it achieves a gradient complexity of $O(\epsilon^{-4})$. DRGDA and DRSGDA are the first algorithms for distributed minimax optimization with nonconvex constraints that converge exactly. Extensive experiments on training deep neural networks (DNNs) over the Stiefel manifold demonstrate the efficiency of our algorithms.
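To make the setting concrete, the following is a minimal single-node sketch of one Riemannian gradient descent ascent step on the Stiefel manifold. It is not the DRGDA or DRSGDA update, whose decentralized details (communication between nodes, step-size rules) are not given in this abstract. The function names `tangent_project`, `qr_retract`, and `rgda_step`, the choice of the embedded (Euclidean) metric, and the QR retraction are all illustrative assumptions.

```python
import numpy as np


def tangent_project(x, g):
    """Project a Euclidean gradient g onto the tangent space of the
    Stiefel manifold at x (assumes x.T @ x = I and the embedded metric)."""
    xtg = x.T @ g
    return g - x @ (xtg + xtg.T) / 2.0


def qr_retract(x, v):
    """Map x + v back onto the Stiefel manifold with a QR-based retraction
    (one common choice; other retractions exist)."""
    q, r = np.linalg.qr(x + v)
    # Flip column signs so that diag(r) is non-negative, which makes the
    # factorization (and hence the retraction) unique.
    signs = np.sign(np.diag(r))
    signs[signs == 0] = 1.0
    return q * signs


def rgda_step(x, y, grad_x, grad_y, eta_x, eta_y):
    """One descent step in x on the manifold and one ascent step in y.

    grad_x and grad_y are user-supplied callables returning the Euclidean
    partial gradients of the objective f(x, y). The variable y is treated
    as unconstrained here, which is an assumption of this sketch.
    """
    riem_grad = tangent_project(x, grad_x(x, y))
    x_new = qr_retract(x, -eta_x * riem_grad)
    y_new = y + eta_y * grad_y(x, y)
    return x_new, y_new
```

As a hypothetical usage example, for a symmetric matrix `A` and scalar `y`, the toy objective $f(x, y) = y\,\mathrm{tr}(x^\top A x) - \tfrac{1}{2} y^2$ is strongly concave in `y`, with partial gradients `2 * y * A @ x` and `np.trace(x.T @ A @ x) - y`. Passing these to `rgda_step` keeps `x` on the manifold after every step.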