Federated learning (FL) is a promising privacy-preserving distributed machine learning paradigm that allows multiple clients (i.e., workers) to collaboratively train statistical models without disclosing their private training data. Because the training data remain local and the on-device training process cannot be inspected, Byzantine workers may launch data poisoning and model poisoning attacks that seriously degrade model performance or prevent the model from converging. Most existing Byzantine-robust FL schemes are either ineffective against several advanced poisoning attacks or rely on a centralized public validation dataset, which is impractical in FL. Moreover, to the best of our knowledge, none of the existing Byzantine-robust distributed learning methods performs well when the data among clients are Non-Independent and Identically Distributed (Non-IID). To address these issues, we propose FedCom, a novel Byzantine-robust federated learning framework that incorporates the idea of commitment from cryptography and tolerates both data poisoning and model poisoning attacks under practical Non-IID data partitions. Specifically, in FedCom, each client is first required to make a commitment to its local training data distribution. We then identify poisoned datasets by comparing the Wasserstein distances among the commitments submitted by different clients. Furthermore, we distinguish abnormal local model updates from benign ones by testing each local model's behavior on its corresponding data commitment. We conduct an extensive performance evaluation of FedCom. The results demonstrate its effectiveness and superior performance compared with state-of-the-art Byzantine-robust schemes in defending against typical data poisoning and model poisoning attacks under practical Non-IID data distributions.
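To make the commitment-comparison step more concrete, below is a minimal, hypothetical sketch. It assumes each commitment is a small array of samples summarizing a client's data distribution and flags clients whose commitments lie unusually far from everyone else's, using the average per-feature 1-D Wasserstein distance from scipy.stats. The function names (pairwise_commitment_distance, flag_outliers) and the standard-deviation threshold rule are illustrative assumptions, not FedCom's actual commitment construction or detection rule.

```python
# Hypothetical sketch of commitment comparison via Wasserstein distance.
import numpy as np
from scipy.stats import wasserstein_distance

def pairwise_commitment_distance(c_a, c_b):
    """Average 1-D Wasserstein distance over feature dimensions of two
    commitments, each an (n_samples, n_features) array."""
    dims = c_a.shape[1]
    return np.mean([wasserstein_distance(c_a[:, d], c_b[:, d]) for d in range(dims)])

def flag_outliers(commitments, threshold=2.0):
    """Mark clients whose mean distance to the other commitments exceeds
    the overall mean by `threshold` standard deviations (assumed rule)."""
    n = len(commitments)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = pairwise_commitment_distance(
                commitments[i], commitments[j])
    avg = dist.sum(axis=1) / (n - 1)
    return avg > avg.mean() + threshold * avg.std()
```

Under these assumptions, a poisoned client's commitment would stand out because its distribution diverges from the (possibly Non-IID but still mutually closer) benign commitments; the actual paper's criterion may differ.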