Minimax optimization problems have attracted significant attention in recent years due to their widespread application in numerous machine learning models. A wide variety of stochastic optimization methods have been proposed to solve the minimax optimization problem. However, most of them ignore the distributed setting, where the training data are distributed across multiple workers. In this paper, we develop a novel decentralized stochastic gradient descent ascent method for the finite-sum minimax optimization problem. In particular, by employing variance-reduced gradients, our method achieves $O(\frac{\sqrt{n}\kappa^3}{(1-\lambda)^2\epsilon^2})$ sample complexity and $O(\frac{\kappa^3}{(1-\lambda)^2\epsilon^2})$ communication complexity for the nonconvex-strongly-concave minimax optimization problem. To the best of our knowledge, our work is the first to achieve such theoretical complexities for this class of problems. Finally, we apply our method to the AUC maximization problem, and the experimental results confirm the effectiveness of our method.
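To make the algorithmic pattern concrete, the sketch below shows one plausible form of plain decentralized stochastic gradient descent ascent with gossip averaging on a toy convex-concave objective. It is an illustrative assumption only: the local objective, the ring mixing matrix, and the step sizes are hypothetical, and the sketch does not include the variance-reduced gradient estimator analyzed in this paper.

```python
import numpy as np

# Minimal sketch of decentralized stochastic gradient descent ascent (SGDA)
# with gossip averaging. NOT the paper's variance-reduced algorithm; the toy
# local objective f_i(x, y) = 0.5*||A_i x - b_i||^2 + y^T x - 0.5*||y||^2 and
# the ring mixing matrix W are assumptions made only for illustration.

rng = np.random.default_rng(0)
n_workers, d = 4, 5
A = [rng.standard_normal((10, d)) for _ in range(n_workers)]  # local data matrices
b = [rng.standard_normal(10) for _ in range(n_workers)]       # local targets

# Doubly stochastic mixing matrix for a ring topology.
W = np.zeros((n_workers, n_workers))
for i in range(n_workers):
    W[i, i] = 0.5
    W[i, (i - 1) % n_workers] = 0.25
    W[i, (i + 1) % n_workers] = 0.25

X = np.zeros((n_workers, d))  # primal (minimization) variables, one row per worker
Y = np.zeros((n_workers, d))  # dual (maximization) variables, one row per worker
eta_x, eta_y = 0.05, 0.05     # step sizes (hypothetical values)

for t in range(200):
    # Local computation: each worker takes a stochastic gradient step
    # using one sampled local example.
    for i in range(n_workers):
        j = rng.integers(A[i].shape[0])
        a, bi = A[i][j], b[i][j]
        gx = a * (a @ X[i] - bi) + Y[i]  # stochastic gradient of f_i w.r.t. x
        gy = X[i] - Y[i]                 # gradient of f_i w.r.t. y
        X[i] -= eta_x * gx               # descent step on the primal variable
        Y[i] += eta_y * gy               # ascent step on the dual variable
    # Communication: gossip averaging with neighbors via the mixing matrix.
    X, Y = W @ X, W @ Y
```

In this pattern, $\lambda$ in the stated complexities corresponds to the second-largest eigenvalue (in magnitude) of the mixing matrix $W$, so a better-connected network (smaller $\lambda$) reduces the communication cost.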