Federated Learning (FL) enables multiple clients to collaboratively train a shared model while preserving data privacy. However, the high memory demand during model training severely limits the deployment of FL on resource-constrained clients. To this end, we propose \our, a scalable and inclusive FL framework designed to overcome memory limitations through sequential block-wise training. The core idea of \our is to partition the global model into blocks and train them sequentially, thereby reducing training memory requirements. To mitigate information loss during block-wise training, \our introduces a Curriculum Mentor that crafts curriculum-aware training objectives for each block to steer their learning process. Moreover, \our incorporates a Training Harmonizer that designs a parameter co-adaptation training scheme to coordinate block updates, effectively breaking inter-block information isolation. Extensive experiments on both simulation and hardware testbeds demonstrate that \our significantly improves model performance by up to 84.2\%, reduces peak memory usage by up to 50.4\%, and accelerates training by up to 1.9$\times$.
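Below is a minimal, hedged sketch of the core idea of sequential block-wise local training on a client, written in PyTorch. The block partitioning, the per-block auxiliary heads, and the plain cross-entropy objective are illustrative assumptions for this sketch; they stand in for, and are not, the paper's Curriculum Mentor objectives or Training Harmonizer co-adaptation scheme.

\begin{verbatim}
# Sketch: sequential block-wise training to cap peak training memory.
# Everything below (block split, heads, loss) is a simplifying assumption.
import torch
import torch.nn as nn

# Hypothetical global model partitioned into blocks.
blocks = nn.ModuleList([
    nn.Sequential(nn.Linear(32, 64), nn.ReLU()),
    nn.Sequential(nn.Linear(64, 64), nn.ReLU()),
    nn.Sequential(nn.Linear(64, 10)),
])
# Hypothetical per-block auxiliary heads that give each block a local
# objective (a stand-in for the curriculum-aware objectives).
heads = nn.ModuleList([nn.Linear(64, 10), nn.Linear(64, 10), nn.Identity()])

def train_block(idx, loader, epochs=1, lr=1e-2):
    """Train only blocks[idx] (plus its head); earlier blocks are frozen
    and run under no_grad, so only one block's activations and gradients
    are resident at a time -- this is what lowers peak memory."""
    params = list(blocks[idx].parameters()) + list(heads[idx].parameters())
    opt = torch.optim.SGD(params, lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            with torch.no_grad():              # frozen prefix: no graph kept
                for blk in blocks[:idx]:
                    x = blk(x)
            logits = heads[idx](blocks[idx](x))  # only this block builds a graph
            opt.zero_grad()
            loss_fn(logits, y).backward()
            opt.step()

if __name__ == "__main__":
    # Toy data; in FL this would be the client's local dataset.
    data = [(torch.randn(16, 32), torch.randint(0, 10, (16,))) for _ in range(4)]
    for idx in range(len(blocks)):             # blocks trained one after another
        train_block(idx, data)
\end{verbatim}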