Distributed machine learning has been widely adopted in recent years to handle large and complex datasets. Accordingly, the security of distributed learning has drawn increasing attention from both academia and industry. In this context, federated learning (FL) was developed as a "secure" form of distributed learning in which private training data are kept locally and only model gradients are communicated between clients and the server. However, a variety of gradient leakage attacks have since been proposed against this procedure, showing that it is insecure. These attacks nevertheless share a common drawback: they require a large amount of auxiliary information, such as model weights, optimizer states, and hyperparameters (e.g., the learning rate), which is difficult to obtain in realistic settings. Moreover, many existing FL algorithms, such as FedAvg, avoid transmitting model gradients and instead send model weights, yet few works have examined the security risks of this practice. In this paper, we present two novel frameworks, DLM and DLM+, to demonstrate that transmitting model weights under the FL scenario is also likely to leak clients' private local data. In addition, we perform a number of experiments to illustrate the effectiveness and generality of our attack frameworks. Finally, we introduce two defenses against the proposed attacks and evaluate their protection effects. With appropriate customization, the proposed attack and defense schemes can also be applied to the general distributed learning scenario.
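For readers unfamiliar with the weight-transmission setting the abstract refers to, the following is a minimal, hypothetical sketch (NumPy, linear regression, illustrative names `local_update` and `server_aggregate`) of a FedAvg-style round in which clients upload model weights rather than gradients; it is not the paper's code, only an illustration of the communication pattern that DLM and DLM+ target.

```python
# Minimal FedAvg-style sketch (NumPy, linear regression) showing that clients
# transmit model *weights*, not gradients. Hypothetical example, not the paper's code.
import numpy as np

def local_update(w_global, X, y, lr=0.1, epochs=5):
    """Client-side local training; returns updated weights (what FedAvg transmits)."""
    w = w_global.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w

def server_aggregate(client_weights):
    """Server averages the received weight vectors (the FedAvg aggregation step)."""
    return np.mean(client_weights, axis=0)

rng = np.random.default_rng(0)
w_global = np.zeros(3)
clients = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(4)]

for rnd in range(3):
    uploads = [local_update(w_global, X, y) for X, y in clients]
    # The server only ever observes weight vectors, yet consecutive uploads still
    # carry information about each client's private (X, y) -- the attack surface
    # that the proposed frameworks exploit.
    w_global = server_aggregate(uploads)
```

The point of the sketch is only that the quantities exchanged are full weight vectors; how much private data they reveal, and under what auxiliary knowledge, is what the paper investigates.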