Federated learning (FL) is a popular distributed machine learning (ML) paradigm, but it is often limited by significant communication costs and the constrained computation capabilities of edge devices. Federated Split Learning (FSL) preserves the parallel model training principle of FL while reducing the device computation requirement by splitting the ML model between the server and the clients. However, FSL still incurs very high communication overhead because the smashed data and gradients are transmitted between the clients and the server in every global round. Furthermore, the server has to maintain a separate model for every client, resulting in a significant computation and storage requirement that grows linearly with the number of clients. This paper addresses these two issues by proposing a communication and storage efficient federated and split learning (CSE-FSL) strategy, which utilizes an auxiliary network to locally update the client models while keeping only a single model at the server, thereby avoiding the communication of gradients from the server and greatly reducing the server's resource requirements. Communication cost is further reduced by having the clients send the smashed data only in selected epochs. We provide a rigorous theoretical analysis of CSE-FSL that guarantees its convergence for non-convex loss functions. Extensive experimental results on several real-world FL tasks demonstrate that CSE-FSL achieves a significant communication reduction over existing FSL techniques while attaining state-of-the-art convergence and model accuracy.
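To make the described workflow concrete, the following is a minimal, hypothetical PyTorch-style sketch of one CSE-FSL global round. The module names (ClientNet, AuxNet, ServerNet), the local epoch count E, the upload period q, and the plain SGD updates are illustrative assumptions rather than details taken from the paper; the sketch only depicts the three ideas named above: client updates driven by a local auxiliary network (so no gradients flow back from the server), a single server-side model shared by all clients, and smashed data uploaded only in selected epochs.

```python
# Hypothetical sketch of one CSE-FSL global round (not the authors' reference code).
import torch
import torch.nn as nn

class ClientNet(nn.Module):          # client-side portion of the split model (assumed layers)
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU())
    def forward(self, x):
        return self.body(x)          # output at the cut layer = "smashed data"

class AuxNet(nn.Module):             # auxiliary head used only for local client updates
    def __init__(self):
        super().__init__()
        self.head = nn.Linear(128, 10)
    def forward(self, z):
        return self.head(z)

class ServerNet(nn.Module):          # the single server-side model shared by all clients
    def __init__(self):
        super().__init__()
        self.head = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
    def forward(self, z):
        return self.head(z)

criterion = nn.CrossEntropyLoss()

def client_local_epoch(client, aux, loader, lr=0.01):
    """Update the client model with the auxiliary loss only; no server gradient is needed."""
    opt = torch.optim.SGD(list(client.parameters()) + list(aux.parameters()), lr=lr)
    smashed_batches = []
    for x, y in loader:
        z = client(x)
        loss = criterion(aux(z), y)              # local loss via the auxiliary network
        opt.zero_grad(); loss.backward(); opt.step()
        smashed_batches.append((z.detach(), y))  # kept in case this epoch is uploaded
    return smashed_batches

def server_update(server, smashed, lr=0.01):
    """Train the single server-side model on smashed data received from a client."""
    opt = torch.optim.SGD(server.parameters(), lr=lr)
    for z, y in smashed:
        loss = criterion(server(z), y)
        opt.zero_grad(); loss.backward(); opt.step()

def global_round(clients, aux_nets, server, loaders, E=4, q=2):
    """One illustrative global round: E local epochs per client, upload every q-th epoch."""
    for client, aux, loader in zip(clients, aux_nets, loaders):
        for epoch in range(E):
            smashed = client_local_epoch(client, aux, loader)
            if (epoch + 1) % q == 0:             # "selected epochs" only, cutting uplink traffic
                server_update(server, smashed)
```

Under these assumptions, the downlink carries no per-client gradients, the server stores one model regardless of the number of clients, and the uplink volume scales with E/q rather than E, which is the source of the communication and storage savings claimed above.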