We consider a standard federated learning architecture where a group of clients periodically coordinate with a central server to train a statistical model. We tackle two major challenges in federated learning: (i) objective heterogeneity, which stems from differences in the clients' local loss functions, and (ii) systems heterogeneity, which leads to slow and straggling client devices. Due to such client heterogeneity, we show that existing federated learning algorithms suffer from a fundamental speed-accuracy conflict: they either guarantee linear convergence but to an incorrect point, or convergence to the global minimum but at a sub-linear rate, i.e., fast convergence comes at the expense of accuracy. To address the above limitation, we propose FedLin - a simple, new algorithm that exploits past gradients and employs client-specific learning rates. When the clients' local loss functions are smooth and strongly convex, we show that FedLin guarantees linear convergence to the global minimum. We then establish matching upper and lower bounds on the convergence rate of FedLin that highlight the trade-offs associated with infrequent, periodic communication. Notably, FedLin is the only approach that is able to match centralized convergence rates (up to constants) for smooth strongly convex, convex, and non-convex loss functions despite arbitrary objective and systems heterogeneity. We further show that FedLin preserves linear convergence rates under aggressive gradient sparsification, and quantify the effect of the compression level on the convergence rate.
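As a concrete illustration of the mechanism sketched above (round-start gradient memory plus client-specific learning rates), the snippet below gives a minimal NumPy sketch of one possible FedLin-style communication round. The exact update rule, the scaling of each client's step size by its number of local steps, and all names (`fedlin_round`, `clients`, `tau_i`, etc.) are assumptions for illustration only, reconstructed from the abstract's high-level description rather than taken from the paper's algorithm statement.

```python
# Minimal sketch of one FedLin-style round, assuming an SVRG-like correction
# built from round-start ("past") gradients and client-specific step sizes.
# All names and the precise update rule are hypothetical illustrations.
import numpy as np

def fedlin_round(x_bar, clients, eta=0.5):
    """One communication round.

    clients: list of (grad_fn, tau_i) pairs, where grad_fn(x) returns the
    gradient of client i's local loss and tau_i is the number of local steps
    that client performs (smaller for slow / straggling devices).
    """
    # Each client reports its gradient at the shared iterate; the server
    # averages them to form the global direction used for the correction.
    round_start_grads = [grad_fn(x_bar) for grad_fn, _ in clients]
    g_bar = np.mean(round_start_grads, axis=0)

    local_models = []
    for (grad_fn, tau_i), g_i0 in zip(clients, round_start_grads):
        x_i = x_bar.copy()
        eta_i = eta / tau_i  # client-specific learning rate (assumed scaling)
        for _ in range(tau_i):
            # Correct the local gradient with the memorized round-start
            # gradients so local steps track the global, not local, objective.
            x_i = x_i - eta_i * (grad_fn(x_i) - g_i0 + g_bar)
        local_models.append(x_i)

    # The server aggregates the local models by simple averaging.
    return np.mean(local_models, axis=0)


# Toy usage on two heterogeneous quadratics f_i(x) = 0.5 * ||A_i x - b_i||^2,
# with one fast client (10 local steps) and one straggler (2 local steps).
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A1 = rng.standard_normal((5, 3)); A1 /= np.linalg.norm(A1, 2)
    A2 = rng.standard_normal((5, 3)); A2 /= np.linalg.norm(A2, 2)
    b1, b2 = rng.standard_normal(5), rng.standard_normal(5)
    clients = [
        (lambda x, A=A1, b=b1: A.T @ (A @ x - b), 10),
        (lambda x, A=A2, b=b2: A.T @ (A @ x - b), 2),
    ]
    x = np.zeros(3)
    for _ in range(100):
        x = fedlin_round(x, clients)
    # Norm of the global gradient as a rough convergence indicator.
    print(np.linalg.norm(np.mean([g(x) for g, _ in clients], axis=0)))
```

In this sketch, a straggler that completes only `tau_i` local steps takes proportionally larger steps, which is one plausible reading of how client-specific learning rates could compensate for systems heterogeneity; the sparsified-gradient variant mentioned in the abstract is not shown here.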