PPA: Preference Profiling Attack Against Federated Learning

Federated learning (FL) trains a global model across a number of decentralized users, each with a local dataset. Compared to traditional centralized learning, FL does not require direct access to local datasets and thus aims to mitigate data privacy concerns. However, FL still leaks private data through inference attacks, including membership inference, property inference, and data inversion. In this work, we propose a new type of privacy inference attack, coined Preference Profiling Attack (PPA), that accurately profiles the private preferences of a local user, e.g., the most liked (or disliked) items from the user's online shopping and the most common expressions in the user's selfies. In general, PPA can profile top-k preferences (k = 1, 2, 3, with k = 1 of particular interest) contingent on the local user's characteristics. Our key insight is that the gradient variation of a local user's model has a distinguishable sensitivity to the sample proportion of a given class, especially the majority (or minority) class. By observing a user model's gradient sensitivity to a class, PPA can profile the sample proportion of that class in the user's local dataset, thereby exposing the user's preference for the class. The inherent statistical heterogeneity of FL further facilitates PPA. We have extensively evaluated PPA's effectiveness using four datasets (MNIST, CIFAR10, RAF-DB and Products-10K). Our results show that PPA achieves 90% and 98% top-1 attack accuracy on MNIST and CIFAR10, respectively. More importantly, in real-world commercial scenarios of shopping (Products-10K) and social networks (RAF-DB), PPA achieves a top-1 attack accuracy of 78% in the former case when inferring the most ordered items (i.e., acting as a commercial competitor), and 88% in the latter case when inferring a victim user's most frequent facial expressions, e.g., disgusted.
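To make the key insight concrete, the following is a minimal toy sketch of why a single local gradient can expose class proportions. It is not the PPA procedure, which the abstract does not specify. It assumes a hypothetical softmax output layer whose weights and bias are all zero, so every class gets probability 1/K and the bias gradient of the mean cross-entropy loss equals 1/K minus the class proportion. The function name `local_bias_gradient` and the class counts are illustrative assumptions.

```python
import numpy as np


def local_bias_gradient(labels, num_classes):
    """Bias gradient of mean cross-entropy for a zero-initialised softmax layer.

    Assumption: weights and bias are all zero, so every logit is 0 and the
    input features do not influence the bias gradient.
    """
    n = labels.shape[0]
    logits = np.zeros((n, num_classes))
    exp = np.exp(logits)
    probs = exp / exp.sum(axis=1, keepdims=True)  # uniform: 1 / num_classes
    one_hot = np.eye(num_classes)[labels]
    # d(loss)/d(bias) = mean over samples of (softmax probability - one-hot label)
    return (probs - one_hot).mean(axis=0)


# Hypothetical local dataset: class 1 is the user's majority ("preferred") class.
num_classes = 4
class_counts = [10, 60, 20, 10]
labels = np.repeat(np.arange(num_classes), class_counts)

grad = local_bias_gradient(labels, num_classes)

# Under the zero-initialisation assumption, proportions follow from the gradient.
estimated_proportions = 1.0 / num_classes - grad
top1_guess = int(np.argmin(grad))  # most negative entry = majority class

print("bias gradient:", grad)
print("estimated class proportions:", estimated_proportions)
print("top-1 preference guess:", top1_guess)
```

In this idealised setting an observer of the gradient recovers the proportions exactly. With a trained model and multiple local steps the relationship is no longer exact, which is presumably why the attack relies on observing gradient sensitivity rather than reading proportions off directly.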
