Federated Learning (FL) is a decentralized machine learning architecture that leverages a large number of remote devices to learn a joint model from distributed training data. However, system heterogeneity is a major obstacle to robust distributed learning performance in an FL network. It has two aspects: i) device heterogeneity, due to the diverse computational capacity among devices; and ii) data heterogeneity, due to non-identically distributed data across the network. Although benchmark methods for heterogeneous FL exist, e.g., FedProx, prior studies lack a formalization of the problem, which remains open. In this work, we formalize the system-heterogeneous FL problem and propose a new algorithm, called FedLGA, which addresses it by bridging the divergence of local model updates via gradient approximation. To achieve this, FedLGA provides an alternated Hessian estimation method, which requires only extra linear complexity on the aggregator. Theoretically, we show that, with a device-heterogeneous ratio $\rho$, FedLGA achieves convergence rates on non-i.i.d. FL training data for non-convex optimization problems of $\mathcal{O} \left( \frac{(1+\rho)}{\sqrt{ENT}} + \frac{1}{T} \right)$ and $\mathcal{O} \left( \frac{(1+\rho)\sqrt{E}}{\sqrt{TK}} + \frac{1}{T} \right)$ under full and partial device participation, respectively, where $E$ is the number of local learning epochs, $T$ is the total number of communication rounds, $N$ is the total number of devices, and $K$ is the number of devices selected in one communication round under the partial participation scheme. Comprehensive experiments on multiple datasets show that FedLGA outperforms current FL benchmarks under system heterogeneity.
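To give a feel for how the two stated rates scale, the following minimal sketch evaluates the expressions inside the $\mathcal{O}(\cdot)$ terms for given $\rho$, $E$, $T$, $N$ and $K$. It is an illustration only: the function names `rate_full` and `rate_partial` are hypothetical, and the hidden constants of the asymptotic bounds are assumed to be 1, so the returned values indicate relative scaling rather than actual error guarantees.

```python
import math


def rate_full(rho: float, E: int, N: int, T: int) -> float:
    """Expression inside the O(.) bound for full device participation.

    Assumption: all hidden constants are set to 1.
    """
    return (1 + rho) / math.sqrt(E * N * T) + 1 / T


def rate_partial(rho: float, E: int, K: int, T: int) -> float:
    """Expression inside the O(.) bound for partial device participation.

    Assumption: all hidden constants are set to 1.
    """
    return (1 + rho) * math.sqrt(E) / math.sqrt(T * K) + 1 / T


if __name__ == "__main__":
    # Hypothetical settings, chosen only to show the trend in T and rho.
    for T in (100, 1000, 10000):
        for rho in (0.0, 0.5):
            full = rate_full(rho, E=5, N=100, T=T)
            part = rate_partial(rho, E=5, K=10, T=T)
            print(f"T={T:>5} rho={rho:.1f} full={full:.5f} partial={part:.5f}")
```

Read directly from the two expressions, both terms shrink as $T$ grows and grow linearly with $(1+\rho)$. The dependence on $E$ differs: it enters as $1/\sqrt{E}$ under full participation and as $\sqrt{E}$ under partial participation.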