Federated Learning (FL) is a decentralized machine learning paradigm in which a global server iteratively averages the model parameters of local users without accessing their data. User heterogeneity poses significant challenges to FL, as it can incur drifted global models that are slow to converge. Knowledge Distillation has recently emerged to tackle this issue, by refining the server model using aggregated knowledge from heterogeneous users rather than directly averaging their model parameters. This approach, however, depends on a proxy dataset, making it impractical unless such a prerequisite is satisfied. Moreover, the ensemble knowledge is not fully utilized to guide local model learning, which may in turn affect the quality of the aggregated model. Inspired by the prior art, we propose a data-free knowledge distillation approach to address heterogeneous FL, where the server learns a lightweight generator to ensemble user information in a data-free manner, which is then broadcast to users, regulating local training using the learned knowledge as an inductive bias. Empirical studies powered by theoretical implications show that our approach facilitates FL with better generalization performance using fewer communication rounds, compared with the state-of-the-art.
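To make the described flow concrete, below is a minimal sketch of the two sides of such a data-free distillation round. It assumes PyTorch, a feature-extractor/classifier-head split of each user model, and hypothetical names (`Generator`, `UserModel`, `server_update`, `local_update`) and simplified hyper-parameters that are not from the paper; it is an illustration of the idea, not the authors' released implementation.

```python
# Sketch: server-side data-free ensemble distillation into a lightweight
# generator, and client-side local training regularized by that generator.
# All names, architectures, and hyper-parameters here are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

LATENT_DIM, FEATURE_DIM, NUM_CLASSES = 32, 64, 10

class Generator(nn.Module):
    """Maps a noise vector plus a label embedding to a synthetic latent feature."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(NUM_CLASSES, LATENT_DIM)
        self.net = nn.Sequential(
            nn.Linear(2 * LATENT_DIM, 128), nn.ReLU(),
            nn.Linear(128, FEATURE_DIM))

    def forward(self, y):
        z = torch.randn(y.size(0), LATENT_DIM)          # sample noise per label
        return self.net(torch.cat([z, self.embed(y)], dim=1))

class UserModel(nn.Module):
    """Toy user model split into a feature extractor and a classifier head."""
    def __init__(self, input_dim=784):
        super().__init__()
        self.extractor = nn.Sequential(nn.Linear(input_dim, FEATURE_DIM), nn.ReLU())
        self.head = nn.Linear(FEATURE_DIM, NUM_CLASSES)

    def forward(self, x):
        return self.head(self.extractor(x))

def server_update(generator, user_heads, label_counts, steps=20, batch=64):
    """Train the generator so that the (label-count-weighted) ensemble of user
    classifier heads assigns the intended label to its synthetic features."""
    opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
    # per-class weights per user, shape (num_users, NUM_CLASSES)
    weights = label_counts / label_counts.sum(dim=0, keepdim=True)
    for _ in range(steps):
        y = torch.randint(0, NUM_CLASSES, (batch,))
        feat = generator(y)
        logits = sum(w[y].unsqueeze(1) * head(feat)
                     for head, w in zip(user_heads, weights))
        loss = F.cross_entropy(logits, y)
        opt.zero_grad(); loss.backward(); opt.step()

def local_update(model, generator, loader, alpha=1.0, epochs=1):
    """Standard local training plus a distillation term: the user's head should
    also classify the generator's synthetic features correctly."""
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    for _ in range(epochs):
        for x, y in loader:
            loss = F.cross_entropy(model(x), y)          # local empirical risk
            y_fake = torch.randint(0, NUM_CLASSES, (x.size(0),))
            with torch.no_grad():
                fake_feat = generator(y_fake)            # inductive bias from server
            loss = loss + alpha * F.cross_entropy(model.head(fake_feat), y_fake)
            opt.zero_grad(); loss.backward(); opt.step()
```

In a full communication round under this sketch, the server would average the uploaded user parameters FedAvg-style, run `server_update` with the users' classifier heads and per-class sample counts, and broadcast both the averaged model and the generator; `alpha` then controls how strongly the ensemble knowledge regularizes each user's local training.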