Many machine learning tasks rely on centralized learning (CL), which requires transmitting the clients' local datasets to a parameter server (PS) and thus entails a huge communication overhead. To overcome this, federated learning (FL) has been proposed as a promising tool, wherein the clients send only their model updates to the PS instead of their whole datasets. However, FL demands powerful computational resources from the clients, and in practice not all clients have sufficient computational resources to participate in training. To address this common scenario, we propose a more efficient approach called hybrid federated and centralized learning (HFCL), wherein only the clients with sufficient resources employ FL, while the remaining clients send their datasets to the PS, which computes the model on their behalf. The model parameters are then aggregated at the PS. To improve the efficiency of dataset transmission, we propose two different techniques: i) increased computation-per-client and ii) sequential data transmission. Notably, the HFCL frameworks achieve up to 20\% higher learning accuracy than FL when only half of the clients can perform FL, since all clients contribute their datasets to the learning process, while incurring 50\% less communication overhead than CL.
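The following is a minimal sketch of one HFCL training round under illustrative assumptions: FL-capable clients run a local update on-device and send only parameters, the PS runs the same update on behalf of CL clients using their uploaded datasets, and all updates are averaged. The helper `local_sgd_step`, the linear least-squares objective, and all dimensions are hypothetical placeholders, not the paper's actual model or aggregation rule.

```python
import numpy as np

def local_sgd_step(w, X, y, lr=0.1):
    """One gradient step on a linear least-squares loss, as a stand-in for local training."""
    grad = X.T @ (X @ w - y) / len(y)
    return w - lr * grad

def hfcl_round(w_global, fl_clients, cl_datasets):
    """fl_clients: list of (X, y) kept on-device; cl_datasets: list of (X, y) uploaded to the PS."""
    updates = []
    # FL clients compute their model updates locally and transmit only parameters.
    for X, y in fl_clients:
        updates.append(local_sgd_step(w_global.copy(), X, y))
    # The PS computes updates on behalf of CL clients from their transmitted datasets.
    for X, y in cl_datasets:
        updates.append(local_sgd_step(w_global.copy(), X, y))
    # Model parameters from both groups are aggregated at the PS (simple average here).
    return np.mean(updates, axis=0)

# Toy usage: 2 FL clients and 2 CL clients, 5-dimensional linear model.
rng = np.random.default_rng(0)
make_dataset = lambda: (rng.normal(size=(32, 5)), rng.normal(size=32))
w = np.zeros(5)
for _ in range(10):
    w = hfcl_round(w, [make_dataset(), make_dataset()], [make_dataset(), make_dataset()])
```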