We consider distributed stochastic variational inequalities (VIs) on unbounded domains, where the problem data is heterogeneous (non-IID) and distributed across many devices. We make a very general assumption on the computational network that, in particular, covers both the setting of fully decentralized computation over time-varying networks and the centralized topologies commonly used in Federated Learning. Moreover, the workers can perform multiple local updates to reduce the communication frequency. We extend the stochastic extragradient method to this general setting and theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone (when a Minty solution exists) settings. The provided rates explicitly exhibit the dependence on network characteristics (e.g., mixing time), the iteration counter, data heterogeneity, variance, the number of devices, and other standard parameters. As a special case, our method and analysis apply to distributed stochastic saddle-point problems (SPPs), e.g., the training of Deep Generative Adversarial Networks (GANs), for which decentralized training has been reported to be extremely challenging. In experiments on the decentralized training of GANs, we demonstrate the effectiveness of the proposed approach.
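For context, here is a minimal sketch of the classical single-machine stochastic extragradient step that the method builds on; the notation ($z$, $\gamma$, $\xi$) is ours rather than the abstract's, and the distributed variant described above additionally interleaves such local steps with communication over the network. On an unbounded domain no projection is needed, consistent with the setting stated above.

\begin{align*}
  % Extrapolation (half-iterate) with a fresh stochastic sample of the operator F:
  z^{k+1/2} &= z^{k} - \gamma\, F\big(z^{k}, \xi^{k}\big), \\
  % Update step, evaluating the operator at the extrapolated point with a new sample:
  z^{k+1}   &= z^{k} - \gamma\, F\big(z^{k+1/2}, \xi^{k+1/2}\big).
\end{align*}

Here $F(\cdot,\xi)$ is an unbiased stochastic oracle for the VI operator $F$ and $\gamma > 0$ is the step size; for saddle-point problems, $F$ stacks the gradient in the minimization variable and the negated gradient in the maximization variable.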