In this paper, we propose to solve a regularized distributionally robust learning problem in the decentralized setting, taking into account the data distribution shift. By adding a Kullback-Leibler regularization function to the robust min-max optimization problem, the learning problem can be reduced to a modified robust minimization problem and solved efficiently. Leveraging the newly formulated optimization problem, we propose a robust version of Decentralized Stochastic Gradient Descent (DSGD), coined Distributionally Robust Decentralized Stochastic Gradient Descent (DR-DSGD). Under some mild assumptions and provided that the regularization parameter is larger than one, we theoretically prove that DR-DSGD achieves a convergence rate of $\mathcal{O}\left(1/\sqrt{KT} + K/T\right)$, where $K$ is the number of devices and $T$ is the number of iterations. Simulation results show that our proposed algorithm can improve the worst distribution test accuracy by up to $10\%$. Moreover, DR-DSGD is more communication-efficient than DSGD since it requires fewer communication rounds (up to $20$ times fewer) to achieve the same worst distribution test accuracy target. Furthermore, the conducted experiments reveal that DR-DSGD results in a fairer performance across devices in terms of test accuracy.
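As a rough illustration of the reduction mentioned above (the notation here is not fixed in the abstract and is only assumed for the sketch: $f_i(\theta)$ denotes the local loss of device $i$, $\lambda > 1$ the regularization parameter, and $\Delta_K$ the probability simplex over the $K$ devices), the Kullback-Leibler-regularized inner maximization admits a closed form via the Gibbs variational principle:
\[
\max_{\mathbf{p} \in \Delta_K} \; \sum_{i=1}^{K} p_i f_i(\theta) \;-\; \lambda \, \mathrm{KL}\!\left(\mathbf{p} \,\Big\|\, \tfrac{1}{K}\mathbf{1}\right) \;=\; \lambda \log\!\left(\frac{1}{K}\sum_{i=1}^{K} \exp\!\big(f_i(\theta)/\lambda\big)\right),
\]
so the robust min-max problem reduces to minimizing the smoothed log-sum-exp objective on the right-hand side, to which DSGD-style decentralized updates can then be applied.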