Due to the communication bottleneck in distributed and federated learning applications, algorithms using communication compression have attracted significant attention and are widely used in practice. Moreover, client variance arises in federated learning because the total number of heterogeneous clients is usually very large and the server is unable to communicate with all of them in each communication round. In this paper, we address these two issues together by proposing compressed and client-variance-reduced methods. Concretely, we introduce COFIG and FRECON, which combine communication compression with client-variance reduction. In the nonconvex setting, COFIG requires $O(\frac{(1+\omega)^{3/2}\sqrt{N}}{S\epsilon^2}+\frac{(1+\omega)N^{2/3}}{S\epsilon^2})$ communication rounds in total, where $N$ is the total number of clients, $S$ is the number of clients communicated with in each round, $\epsilon$ is the convergence error, and $\omega$ is the parameter of the compression operator. Moreover, FRECON converges faster than COFIG in the nonconvex setting, requiring only $O(\frac{(1+\omega)\sqrt{N}}{S\epsilon^2})$ communication rounds. In the convex setting, COFIG converges within $O(\frac{(1+\omega)\sqrt{N}}{S\epsilon})$ communication rounds, which is also the first convergence result for compression schemes that do not communicate with all clients in each round. In summary, neither COFIG nor FRECON needs to communicate with all clients, and both provide the first or faster convergence results for convex and nonconvex federated learning, whereas previous works either require communication with all clients (and are thus impractical) or obtain worse convergence results.
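For context, $\omega$ above is presumably the variance parameter of an unbiased compression operator, the standard convention in this line of work; the following is a minimal sketch of that assumption (the operator $\mathcal{C}$, the dimension $d$, and the sparsification example are illustrative, not taken from the abstract itself):
$$\mathbb{E}[\mathcal{C}(x)] = x, \qquad \mathbb{E}\big[\|\mathcal{C}(x) - x\|^2\big] \le \omega \|x\|^2 \quad \text{for all } x \in \mathbb{R}^d.$$
For example, random sparsification that keeps $k$ of the $d$ coordinates and rescales them by $d/k$ satisfies this with $\omega = d/k - 1$, so a larger $\omega$ corresponds to more aggressive compression and, per the rates above, to more communication rounds.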