Decentralized learning algorithms empower interconnected edge devices to share data and computational resources in order to collaboratively train a machine learning model without the aid of a central coordinator (e.g., an orchestrating base station). When the data distributions at the network devices are heterogeneous, collaboration can yield predictors with unsatisfactory performance for a subset of the devices. For this reason, in this work we formulate a distributionally robust decentralized learning task and propose a decentralized single-loop gradient descent/ascent algorithm (AD-GDA) to solve the underlying minimax optimization problem. We make the algorithm communication-efficient by employing a compressed consensus scheme, and we provide convergence guarantees for smooth convex and non-convex loss functions. Finally, we corroborate the theoretical findings with empirical evidence of the ability of the proposed algorithm to provide unbiased predictors over a network of collaborating devices with highly heterogeneous data distributions.
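To make the setting concrete, a typical distributionally robust objective over $m$ collaborating devices is $\min_{\theta}\max_{\lambda \in \Delta_m}\sum_{i=1}^{m}\lambda_i f_i(\theta)$, where $f_i$ is device $i$'s local loss and $\Delta_m$ is the probability simplex; the abstract does not spell out the exact formulation, so this expression and the sketch below are illustrative only. The sketch combines a single-loop descent/ascent update with a simple compressed gossip step; all names and parameters (top_k_compress, project_simplex, robust_gda_round, W, eta_theta, eta_lam, k) are invented for illustration and do not reproduce the paper's AD-GDA updates.

```python
import numpy as np

def top_k_compress(x, k):
    """Keep the k largest-magnitude entries of x and zero the rest.
    One common compression operator; the abstract does not say which
    compressor AD-GDA actually uses."""
    out = np.zeros_like(x)
    idx = np.argpartition(np.abs(x), -k)[-k:]
    out[idx] = x[idx]
    return out

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u - (css - 1.0) / (np.arange(len(v)) + 1) > 0)[0][-1]
    tau = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - tau, 0.0)

def robust_gda_round(thetas, lam, grads, losses, W, eta_theta, eta_lam, k):
    """One illustrative single-loop descent/ascent round over m devices for
    min_theta max_{lam in simplex} sum_i lam_i * f_i(theta):
      1) each device takes a descent step on its model with its lam-weighted gradient,
      2) devices average their models via compressed gossip with mixing matrix W,
      3) the distribution weights lam take a projected ascent step using the local losses.
    This is a sketch of the general recipe, not the authors' exact AD-GDA updates."""
    m = len(thetas)
    # 1) local descent step, weighted by the adversarial distribution weight lam[i]
    half = [thetas[i] - eta_theta * lam[i] * grads[i] for i in range(m)]
    # 2) compressed consensus: each device gossips a compressed version of its model
    #    (practical schemes often compress differences w.r.t. local copies; simplified here)
    msgs = [top_k_compress(half[i], k) for i in range(m)]
    new_thetas = [sum(W[i, j] * msgs[j] for j in range(m)) for i in range(m)]
    # 3) projected gradient ascent on the distribution weights
    new_lam = project_simplex(lam + eta_lam * np.asarray(losses))
    return new_thetas, new_lam
```

In a fully decentralized deployment, the mixing step would be realized through neighbor-to-neighbor message exchanges over the communication graph encoded by W, rather than the centralized loop written here for readability.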