Federated Learning (FL) is a machine learning paradigm in which local nodes collaboratively train a central model while the training data remains decentralized. Existing FL methods typically share model parameters or employ co-distillation to address unbalanced data distributions. However, they suffer from communication bottlenecks and, more importantly, risk privacy leakage. In this work, we develop a privacy-preserving and communication-efficient method in an FL framework using one-shot offline knowledge distillation on unlabeled, cross-domain public data. We propose a quantized and noisy ensemble of local predictions from fully trained local models, which provides stronger privacy guarantees without sacrificing accuracy. Through extensive experiments on image classification and text classification tasks, we show that our privacy-preserving method outperforms baseline FL algorithms in both accuracy and communication efficiency.
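To make the quantize-then-perturb-then-ensemble idea concrete, the following is a minimal sketch of the server-side aggregation of client predictions on the shared public data. It is an illustration under stated assumptions, not the paper's implementation: the function names (`local_predictions`, `quantize`, `add_noise`, `ensemble`), the uniform quantization grid, the Gaussian noise scale, and the use of NumPy arrays are all choices made here for clarity.

```python
import numpy as np

def local_predictions(model_fn, public_x):
    """Each client runs its fully trained local model on the shared
    unlabeled public data and returns class-probability predictions."""
    return model_fn(public_x)  # shape: (num_samples, num_classes)

def quantize(probs, num_levels=16):
    """Uniformly quantize probabilities to limit what each client reveals
    and to cut the bits uploaded in the single communication round."""
    return np.round(probs * (num_levels - 1)) / (num_levels - 1)

def add_noise(probs, scale=0.05, rng=None):
    """Perturb the quantized predictions with Gaussian noise; the scale is
    a tunable privacy knob (assumed value, not from the paper)."""
    rng = rng or np.random.default_rng(0)
    noisy = probs + rng.normal(0.0, scale, size=probs.shape)
    return np.clip(noisy, 0.0, 1.0)

def ensemble(client_preds):
    """Server averages the quantized, noisy client predictions once and
    renormalizes them into soft labels for offline distillation."""
    avg = np.stack(client_preds, axis=0).mean(axis=0)
    return avg / avg.sum(axis=1, keepdims=True)

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    public_x = rng.normal(size=(100, 8))  # dummy stand-in for unlabeled public data

    def make_model(seed):
        # Stand-in for a fully trained local model: random linear head + softmax.
        w = np.random.default_rng(seed).normal(size=(8, 10))
        def model(x):
            logits = x @ w
            e = np.exp(logits - logits.max(axis=1, keepdims=True))
            return e / e.sum(axis=1, keepdims=True)
        return model

    clients = [make_model(s) for s in range(5)]
    preds = [add_noise(quantize(local_predictions(m, public_x)), rng=rng) for m in clients]
    soft_labels = ensemble(preds)  # one-shot upload -> server-side distillation targets
    print(soft_labels.shape)       # (100, 10)
```

In this reading, each client uploads predictions on the public set exactly once after local training, so communication is a single round, and the quantization plus noise degrades what an adversary can infer about any client's private data from its contribution to the ensemble.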