Federated learning enables clients to collaboratively learn a shared global model without sharing their local training data with a cloud server. However, malicious clients can corrupt the global model so that it predicts incorrect labels for testing examples. Existing defenses against malicious clients leverage Byzantine-robust federated learning methods. However, these methods cannot provably guarantee that the predicted label for a testing example is not affected by malicious clients. We bridge this gap via ensemble federated learning. In particular, given any base federated learning algorithm, we use the algorithm to learn multiple global models, each of which is learned using a randomly selected subset of clients. When predicting the label of a testing example, we take a majority vote among the global models. We show that our ensemble federated learning with any base federated learning algorithm is provably secure against malicious clients. Specifically, the label predicted by our ensemble global model for a testing example is provably unaffected by a bounded number of malicious clients. Moreover, we show that our derived bound is tight. We evaluate our method on the MNIST and Human Activity Recognition datasets. For instance, our method achieves a certified accuracy of 88% on MNIST when 20 out of 1,000 clients are malicious.
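To make the ensemble construction concrete, the following is a minimal sketch of the training and prediction steps described above. The names `train_ensemble`, `base_fl_train`, and `ensemble_predict` are hypothetical, and the base federated learning routine (e.g., FedAvg) is passed in as a placeholder; this is an illustration under those assumptions, not the paper's implementation.

```python
# Sketch of ensemble federated learning: train several global models, each on a
# randomly selected subset of clients, then predict by majority vote.
# All function and parameter names here are illustrative.
import random
from collections import Counter

def train_ensemble(clients, base_fl_train, num_models, subset_size, seed=0):
    """Train `num_models` global models, each with any base FL algorithm
    (e.g., FedAvg) on a random subset of `subset_size` clients."""
    rng = random.Random(seed)
    models = []
    for _ in range(num_models):
        subset = rng.sample(clients, subset_size)  # randomly selected subset of clients
        models.append(base_fl_train(subset))       # placeholder for the base FL algorithm
    return models

def ensemble_predict(models, x):
    """Predict the label of a testing example by majority vote among the global models."""
    votes = Counter(model.predict(x) for model in models)
    return votes.most_common(1)[0][0]
```

The intuition behind the provable guarantee is that a malicious client can only influence the global models whose training subsets include it, so a bounded number of malicious clients can change only a bounded number of votes in the majority vote.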