Learning a privacy-preserving model from distributed sensitive data is an increasingly important problem, often formulated in the federated learning context. Variational inference has recently been extended to the non-private federated learning setting via the partitioned variational inference algorithm. For privacy protection, the current gold standard is differential privacy, which guarantees privacy in a strong, mathematically rigorous sense. In this paper, we present differentially private partitioned variational inference, the first general framework for learning a variational approximation to a Bayesian posterior distribution in the federated learning setting while minimising the number of communication rounds and providing differential privacy guarantees for data subjects. We propose three alternative implementations within the general framework: one based on perturbing the local optimisation performed by individual parties, and two based on perturbing the global updates (one using a version of federated averaging, the other adding virtual parties to the protocol), and we compare their properties both theoretically and empirically. We show that perturbing the local optimisation works well with both simple and complex models as long as each party has enough local data; however, privacy is then always guaranteed by each party independently. In contrast, perturbing the global updates works best with relatively simple models, and, given access to suitable secure primitives such as secure aggregation or secure shuffling, its performance can be improved by having all parties guarantee privacy jointly.
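As a minimal illustrative sketch (not the paper's implementation), the snippet below shows the basic primitive underlying both the local and global perturbation variants: clipping an update vector to a bounded L2 norm and adding Gaussian noise calibrated to that bound (the Gaussian mechanism). The function name `dp_perturb_update` and all parameter values are hypothetical.

```python
import numpy as np

def dp_perturb_update(update, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """Clip `update` to L2 norm `clip_norm`, then add Gaussian noise with
    standard deviation noise_multiplier * clip_norm (Gaussian mechanism).
    Illustrative sketch only; parameter choices are placeholders."""
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(update)
    # Scale the update down if its norm exceeds the clipping bound.
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))
    # Noise scale is tied to the clipping bound, which bounds the sensitivity.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise

# In the local-perturbation variant, each party would privatise its own
# update like this before sending it to the server; in the global variants,
# analogous noise is instead applied to the aggregated update.
party_update = np.array([0.3, -1.7, 0.5])
private_update = dp_perturb_update(party_update, clip_norm=1.0,
                                   noise_multiplier=1.2)
```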