Federated learning is an emerging distributed machine learning framework that aims to protect data privacy. Data heterogeneity is one of the core challenges in federated learning, and it can severely degrade the convergence rate and prediction performance of deep neural networks. To address this issue, we develop a novel personalized federated learning framework for heterogeneous data, which we refer to as FedSplit. The framework is motivated by the finding that data on different clients contain both common knowledge and personalized knowledge, so the hidden elements in each neural layer can be split into a shared group and a personalized group. Based on this decomposition, a novel objective function is established and optimized. We demonstrate, both theoretically and empirically, that FedSplit enjoys a faster convergence speed than the standard federated learning method. The generalization bound of the FedSplit method is also studied. To implement the proposed method on real datasets, factor analysis is introduced to facilitate the decoupling of hidden elements. This yields a practically implementable version of FedSplit, which we further refer to as FedFac. Simulation studies demonstrate that factor analysis can accurately recover the underlying shared/personalized decomposition. The superior prediction performance of FedFac is further verified empirically through comparisons with various state-of-the-art federated learning methods on several real datasets.
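To make the shared/personalized decomposition concrete, the sketch below illustrates the idea with a simplified variance-based proxy rather than the factor analysis used in FedFac: hidden units whose parameters barely vary across clients are treated as shared (common knowledge), while high-variance units are treated as personalized. The function name, threshold, and toy weights are illustrative assumptions, not part of the paper.

```python
import statistics

def split_hidden_units(client_weights, threshold):
    """Split hidden-unit indices into shared vs. personalized groups.

    client_weights: list of per-client weight lists, one value per hidden unit.
    A unit whose cross-client variance falls below `threshold` is labeled
    shared; otherwise personalized. This is a simplified stand-in for the
    factor-analysis-based decoupling described in the abstract.
    """
    n_units = len(client_weights[0])
    shared, personalized = [], []
    for j in range(n_units):
        values = [w[j] for w in client_weights]  # unit j across all clients
        if statistics.pvariance(values) < threshold:
            shared.append(j)
        else:
            personalized.append(j)
    return shared, personalized

# Three clients, four hidden units: units 0 and 2 agree across clients,
# units 1 and 3 differ, so they should land in the personalized group.
weights = [
    [0.5, 1.0, -0.2, 3.0],
    [0.5, -1.0, -0.2, 0.1],
    [0.5, 2.0, -0.2, -2.5],
]
shared, personalized = split_hidden_units(weights, threshold=0.01)
# shared -> [0, 2], personalized -> [1, 3]
```

In a federated round, the shared group would be averaged across clients as in standard federated averaging, while each client keeps its own copy of the personalized group.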