Privacy in Federated Learning (FL) is studied at two granularities: item-level, which protects individual data points, and user-level, which protects each user (participant) in the federation. Nearly all of the private FL literature studies privacy attacks and defenses at these two granularities. Recently, subject-level privacy has emerged as an alternative granularity for cross-silo FL settings, protecting individuals (\emph{data subjects}) whose data is spread across multiple (organizational) users. An adversary might attack the trained model to recover private information about these data subjects. A systematic study of this threat requires complete control over the federation, which is impossible with real-world datasets. We therefore design a simulator for generating diverse synthetic federation configurations, enabling us to study how properties of the data, model design and training, and the federation itself impact subject privacy risk. We propose three attacks for \emph{subject membership inference} and examine the interplay between all factors within a federation that affect the attacks' efficacy. We also investigate the effectiveness of Differential Privacy in mitigating this threat. Our takeaways generalize to real-world datasets such as FEMNIST, lending credence to our findings.