Federated learning (FL) enables collaborative training of machine learning models while preserving data privacy. However, traditional FL relies heavily on a trusted centralized server: it is vulnerable to poisoning attacks, the sharing of raw model updates risks reconstruction of the private training data, and it suffers from poor efficiency due to heavy communication costs. Although decentralized FL eliminates the central dependency, it may worsen the other problems owing to insufficient constraints on participant behavior and the need for distributed consensus on the global model update. In this paper, we propose a blockchain-based fully decentralized peer-to-peer (P2P) framework for FL, called BlockDFL for short. It leverages blockchain to force participants to behave well, and it integrates gradient compression with our designed voting mechanism to coordinate decentralized FL among peer participants without mutual trust, while preventing data from being reconstructed from transmitted model updates. Extensive experiments on two real-world datasets show that BlockDFL achieves accuracy competitive with centralized FL and can defend against poisoning attacks while remaining efficient and scalable. In particular, even when the proportion of malicious participants is as high as 40%, BlockDFL still preserves the accuracy of FL, outperforming existing blockchain-based fully decentralized FL frameworks.