As a prevalent distributed learning paradigm, Federated Learning (FL) trains a global model on a massive number of devices with infrequent communication. This paper investigates a class of composite optimization and statistical recovery problems in the FL setting, whose loss function consists of a data-dependent smooth loss and a non-smooth regularizer. Examples include sparse linear regression using Lasso and low-rank matrix recovery using nuclear norm regularization. In the existing literature, federated composite optimization algorithms are designed only from an optimization perspective, without any statistical guarantees. In addition, they do not consider the (restricted) strong convexity that is commonly assumed in statistical recovery problems. We advance the frontiers of this problem from both the optimization and the statistical perspectives. On the optimization front, we propose a new algorithm named \textit{Fast Federated Dual Averaging} for strongly convex and smooth loss and establish state-of-the-art iteration and communication complexity in the composite setting. In particular, we prove that it enjoys a fast rate, linear speedup, and reduced communication rounds. On the statistical front, for restricted strongly convex and smooth loss, we design another algorithm, namely \textit{Multi-stage Federated Dual Averaging}, and prove a high-probability complexity bound with linear speedup up to optimal statistical precision. Experiments on both synthetic and real data demonstrate that our methods outperform the baselines. To the best of our knowledge, this is the first work providing fast optimization algorithms and statistical recovery guarantees for composite problems in FL.
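To make the composite setting concrete, the following is a minimal sketch of a plain dual-averaging scheme with local steps for a federated Lasso problem, where each client holds a smooth least-squares loss and the non-smooth regularizer is an l1 penalty. It is not the \textit{Fast Federated Dual Averaging} or \textit{Multi-stage Federated Dual Averaging} algorithm, whose details are not given in this abstract. The function names, the toy data, the step size, and the number of rounds are all illustrative assumptions.

```python
import math

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1, applied coordinate-wise."""
    return [math.copysign(max(abs(x) - tau, 0.0), x) for x in v]

def local_gradient(X, y, w):
    """Gradient of the smooth local loss (1 / (2n)) * ||X w - y||^2."""
    n, d = len(X), len(w)
    residual = [sum(X[i][j] * w[j] for j in range(d)) - y[i] for i in range(n)]
    return [sum(X[i][j] * residual[i] for i in range(n)) / n for j in range(d)]

def federated_dual_averaging(clients, lam, eta, rounds, local_steps):
    """Simplified sketch (an assumption, not the paper's algorithm).

    Each client accumulates gradients in a dual state z, recovers the primal
    point by soft-thresholding z, and the server averages the dual states
    once per communication round.
    """
    d = len(clients[0][0][0])
    z = [0.0] * d  # shared dual state: minus eta times accumulated gradients
    t = 0          # gradient steps taken so far by each client
    for _ in range(rounds):
        local_duals = []
        for X, y in clients:
            z_m = list(z)
            for k in range(local_steps):
                # Primal point: argmin_w 0.5*||w - z_m||^2 + (t+k)*eta*lam*||w||_1
                w = soft_threshold(z_m, (t + k) * eta * lam)
                g = local_gradient(X, y, w)
                z_m = [zi - eta * gi for zi, gi in zip(z_m, g)]
            local_duals.append(z_m)
        # One communication round: average the dual states, not the primal ones.
        z = [sum(col) / len(clients) for col in zip(*local_duals)]
        t += local_steps
    return soft_threshold(z, t * eta * lam)

# Hypothetical toy data: two clients, identity design, slightly different targets.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
clients = [
    (identity, [2.0, 0.1, 0.0]),
    (identity, [2.2, -0.1, 0.0]),
]
w_hat = federated_dual_averaging(clients, lam=0.1, eta=0.5, rounds=50, local_steps=4)
print(w_hat)
```

For this toy design the minimizer of the averaged objective has a closed form: the soft-thresholded mean target, (1.8, 0, 0). The sketch recovers this sparse solution. Averaging dual states rather than primal iterates is the design choice that keeps the averaged model sparse after thresholding.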