Federated learning (FL) is a type of collaborative machine learning in which participating clients process their data locally and share only updates to the collaborative model. Among other benefits, this enables building privacy-aware distributed machine learning models. The goal is to optimize a statistical model's parameters by minimizing a cost function over a collection of datasets stored locally by a set of clients. This process exposes the clients to two issues: leakage of private information and lack of personalization of the model. At the same time, with recent advances in data-analysis techniques, concern over violations of the participating clients' privacy has surged. To mitigate this, differential privacy and its variants serve as a standard for providing formal privacy guarantees. The clients often represent very heterogeneous communities and hold very diverse data. Therefore, in line with the FL community's recent focus on building frameworks of personalized models that reflect the users' diversity, it is also of utmost importance to protect the clients' sensitive and personal information against potential threats. $d$-privacy, a generalization of geo-indistinguishability, the recently popularized paradigm of location privacy, uses a metric-based obfuscation technique that preserves the spatial distribution of the original data. To protect the clients' privacy while allowing personalized model training that enhances the fairness and utility of the system, we propose a method that provides group privacy guarantees by exploiting key properties of $d$-privacy, enabling personalized models under the framework of FL. We provide theoretical justification for the applicability of the method and experimental validation on real-world datasets to illustrate how it works.
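For reference, we recall the standard formulation of $d$-privacy from the metric-privacy literature (the notation below is generic and not specific to our construction): a randomized mechanism $\mathcal{K}$ satisfies $d$-privacy for a metric $d$ on the input space $\mathcal{X}$ if, for every pair of inputs $x, x' \in \mathcal{X}$ and every measurable set $S$ of outputs,
\[
  \Pr[\mathcal{K}(x) \in S] \;\le\; e^{\,d(x,x')} \, \Pr[\mathcal{K}(x') \in S].
\]
Taking $d(x,x') = \varepsilon \lVert x - x' \rVert$ over a geographical space recovers geo-indistinguishability, while taking $d$ to be the Hamming distance between databases recovers standard $\varepsilon$-differential privacy.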