Most cross-device federated learning (FL) studies focus on the model-homogeneous setting where the global server model and local client models are identical. However, such a constraint not only excludes low-end clients who would otherwise make unique contributions to model training but also restrains clients from training large models due to on-device resource bottlenecks. In this work, we propose FedRolex, a partial training (PT)-based approach that enables model-heterogeneous FL and can train a global server model larger than the largest client model. At its core, FedRolex employs a rolling sub-model extraction scheme that allows different parts of the global server model to be evenly trained, which mitigates the client drift induced by the inconsistency between individual client models and the server model architecture. We show that FedRolex outperforms state-of-the-art PT-based model-heterogeneous FL methods (e.g., Federated Dropout) and reduces the gap between model-heterogeneous and model-homogeneous FL, especially under the large-model large-dataset regime. In addition, we provide a theoretical statistical analysis of its advantage over Federated Dropout and evaluate FedRolex on an emulated real-world device distribution to show that FedRolex can enhance the inclusiveness of FL and boost the performance of low-end devices that would otherwise not benefit from FL. Our code is available at https://github.com/MSU-MLSys-Lab/FedRolex.
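To make the rolling sub-model extraction scheme concrete, the following is a minimal sketch (not the authors' implementation; the function name `rolling_submodel_indices` and the `client_capacity` parameter are illustrative assumptions). In each communication round, a client receives the slice of a global layer whose hidden-unit indices form a contiguous window; the window start advances by one index per round with wrap-around, so every part of the global model is trained evenly over time:

```python
import numpy as np

def rolling_submodel_indices(global_width: int,
                             client_capacity: float,
                             round_idx: int) -> np.ndarray:
    """Hypothetical sketch of a FedRolex-style rolling window.

    global_width:    number of hidden units in the global layer.
    client_capacity: fraction of the layer the client can hold (0 < c <= 1).
    round_idx:       current communication round.

    Returns the indices of the global layer's hidden units that this
    client trains in this round. The window rolls by one unit per round
    (modulo the layer width), so all units are covered evenly over time.
    """
    sub_width = int(global_width * client_capacity)
    start = round_idx % global_width
    return np.arange(start, start + sub_width) % global_width

# Example: a global layer of 8 units, a client holding half the layer.
for r in range(3):
    print(r, rolling_submodel_indices(8, 0.5, r))
# round 0 -> [0 1 2 3]; round 1 -> [1 2 3 4]; round 2 -> [2 3 4 5]
```

Contrast this with random sub-model extraction as in Federated Dropout, where the window position would be drawn at random each round; the deterministic rolling schedule is what guarantees even coverage of the global model's parameters.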