Motivated by ever-increasing concerns over personal data privacy and the rapidly growing data volume at local clients, federated learning (FL) has emerged as a new machine learning setting. An FL system consists of a central parameter server and multiple local clients. It keeps data at the local clients and learns a centralized model by sharing the locally learned model parameters. Since no local data needs to be shared, privacy can be well protected. Nevertheless, because it is the model rather than the raw data that is shared, the system is exposed to model poisoning attacks launched by malicious clients. Furthermore, identifying malicious clients is challenging because no local client data is available on the server. In addition, membership inference attacks can still be performed by using the uploaded model to estimate a client's local data, leading to privacy disclosure. In this work, we first propose a model-update-based federated averaging algorithm to defend against Byzantine attacks such as additive noise attacks and sign-flipping attacks. An individual client model initialization method is then presented to provide further privacy protection against membership inference attacks by hiding the individual local model. Combining the two schemes enhances both privacy and security. The proposed schemes are shown experimentally to converge under non-IID data distributions when there are no attacks, and to perform much better than the classical model-based FedAvg algorithm under Byzantine attacks.
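To make the setting concrete, the following is a minimal sketch of one round of model-update-based federated averaging together with the two Byzantine attacks named above. The function names, the single-gradient-step client, and the plain mean aggregation are illustrative assumptions for exposition only; they do not reproduce the paper's actual defense or initialization scheme.

```python
import numpy as np

def client_update(global_weights, local_gradient, lr=0.1):
    """An honest client takes a local step and reports only the model
    update (the difference from the received global model), not the model."""
    new_weights = global_weights - lr * local_gradient
    return new_weights - global_weights

def byzantine_update(honest_update, attack="sign_flip", noise_std=1.0):
    """The two Byzantine attacks mentioned in the abstract."""
    if attack == "sign_flip":
        return -honest_update                                   # sign-flipping attack
    return honest_update + np.random.normal(0.0, noise_std,
                                            honest_update.shape)  # additive noise attack

def server_aggregate(global_weights, updates):
    """Update-based averaging: the server averages the received updates
    and applies the mean to the current global model (no raw data seen)."""
    return global_weights + np.mean(updates, axis=0)

# Toy round: 4 honest clients and 1 sign-flipping attacker on a 10-dim model.
dim = 10
w_global = np.zeros(dim)
local_grads = [np.random.randn(dim) for _ in range(5)]
updates = [client_update(w_global, g) for g in local_grads[:4]]
updates.append(byzantine_update(client_update(w_global, local_grads[4])))
w_global = server_aggregate(w_global, np.stack(updates))
```

The sketch only illustrates the message flow, i.e. that clients upload updates rather than full models; a robust aggregation rule rather than the plain mean used here would be needed to actually tolerate the attackers.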