Federated learning has become a widely used paradigm for collaboratively training a common model among different participants with the help of a central server that coordinates the training. Although only model parameters or other model updates are exchanged during federated training, rather than the participants' data, many attacks have shown that sensitive information can still be inferred, including membership, properties of the training data, or even outright reconstruction of participant data. Differential privacy is considered an effective defense against such privacy attacks, but it is also criticized for its negative effect on utility. Another possible defense is secure aggregation, which gives the server access only to the aggregated update rather than to each individual one; it is often more appealing because it does not degrade model quality. However, combining the aggregated updates across rounds, each of which is generated by a different composition of clients, may still allow the inference of some client-specific information. In this paper, we show that simple linear models can effectively capture client-specific properties from the aggregated model updates alone, owing to the linearity of aggregation. We formulate an optimization problem across different rounds to infer a tested property of every client from the output of the linear models, for example, whether a client has a specific sample in its training data (membership inference) or whether it misbehaves and attempts to degrade the performance of the common model through poisoning attacks. Our reconstruction technique is completely passive and undetectable. We demonstrate the efficacy of our approach in several scenarios, which show that secure aggregation provides very limited privacy guarantees in practice. The source code will be released upon publication.
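The role that linearity plays here can be illustrated with a toy simulation. The sketch below is not the paper's method: it assumes that each client contributes a fixed scalar "property score", that the server observes only the noisy per-round sum of the scores of the participating clients, and that the participant set of each round is known. All names (`simulate_rounds`, `infer_client_scores`, the score values, the 0.5 threshold) are hypothetical. Under these assumptions, the per-client scores can be estimated by ordinary least squares over the rounds.

```python
import numpy as np


def simulate_rounds(scores, n_rounds, clients_per_round, noise_std, rng):
    """Simulate a server's view under secure aggregation: per-round sums only."""
    n_clients = len(scores)
    participation = np.zeros((n_rounds, n_clients))
    for t in range(n_rounds):
        # A different random subset of clients takes part in every round.
        chosen = rng.choice(n_clients, size=clients_per_round, replace=False)
        participation[t, chosen] = 1.0
    # Assumed linearity: the score of an aggregate is the sum of the
    # participating clients' scores, plus observation noise.
    noise = rng.normal(0.0, noise_std, size=n_rounds)
    aggregated = participation @ scores + noise
    return participation, aggregated


def infer_client_scores(participation, aggregated):
    """Least-squares estimate of per-client scores from aggregated values."""
    estimate, *_ = np.linalg.lstsq(participation, aggregated, rcond=None)
    return estimate


rng = np.random.default_rng(0)
true_scores = np.zeros(20)
true_scores[[3, 11]] = 1.0  # hypothetical: clients 3 and 11 have the property

A, b = simulate_rounds(
    true_scores, n_rounds=200, clients_per_round=5, noise_std=0.1, rng=rng
)
est = infer_client_scores(A, b)

# Threshold the estimates to decide which clients have the property.
flagged = np.flatnonzero(est > 0.5)
print("flagged clients:", flagged.tolist())
```

In this toy setting, no individual score is ever observed, yet the varying composition of clients across rounds makes the linear system overdetermined, so thresholding the estimates singles out the clients with the property. Real model updates are high-dimensional and change from round to round, so this only conveys the intuition.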