Machine learning (ML) models trained on personal data have been shown to leak information about users. Differential privacy (DP) enables model training with a guaranteed bound on this leakage. Each new model trained with DP increases the bound on data leakage and can be seen as consuming part of a global privacy budget that should not be exceeded. This budget is a scarce resource that must be carefully managed to maximize the number of successfully trained models. We describe PrivateKube, an extension to the popular Kubernetes datacenter orchestrator that adds privacy as a new type of resource to be managed alongside traditional compute resources such as CPU, GPU, and memory. The abstractions we design for the privacy resource mirror those defined by Kubernetes for traditional resources, but there are also major differences. For example, traditional compute resources are replenishable while privacy is not: a CPU can be regained after a model finishes execution, while privacy budget cannot. This distinction forces a redesign of the scheduler. We present DPF (Dominant Private Block Fairness), a variant of the popular Dominant Resource Fairness (DRF) algorithm that is geared toward the non-replenishable privacy resource but enjoys theoretical properties similar to those of DRF. We evaluate PrivateKube and DPF on microbenchmarks and an ML workload on Amazon Reviews data. Compared to existing baselines, DPF allows training more models under the same global privacy guarantee. This is especially true for DPF over Rényi DP, a highly composable form of DP.
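
The contrast between a replenishable resource and the non-replenishable privacy resource can be illustrated with a minimal Python sketch. The class names `CpuPool` and `PrivacyBudget`, their methods, the helper `run_jobs`, and all numeric values are hypothetical and are not part of PrivateKube. The budget check assumes basic sequential composition, in which the epsilons of successive DP computations add up; it is not the DPF algorithm and does not model Rényi DP.

```python
from dataclasses import dataclass


@dataclass
class CpuPool:
    """Replenishable resource: cores return to the pool when a job ends."""
    free_cores: int

    def acquire(self, cores: int) -> bool:
        if cores > self.free_cores:
            return False
        self.free_cores -= cores
        return True

    def release(self, cores: int) -> None:
        self.free_cores += cores


@dataclass
class PrivacyBudget:
    """Non-replenishable resource: spent epsilon is never returned."""
    epsilon_global: float
    epsilon_spent: float = 0.0

    def consume(self, epsilon: float) -> bool:
        # Assumption: basic sequential composition, so epsilons simply add up.
        if self.epsilon_spent + epsilon > self.epsilon_global:
            return False
        self.epsilon_spent += epsilon
        return True


def run_jobs(cpu: CpuPool, budget: PrivacyBudget,
             n_jobs: int, cores: int, epsilon: float) -> list[bool]:
    """Run jobs one after another; return whether each one was admitted."""
    admitted = []
    for _ in range(n_jobs):
        if not cpu.acquire(cores):
            admitted.append(False)
            continue
        ok = budget.consume(epsilon)  # there is no matching "release" for privacy
        cpu.release(cores)            # the CPU is regained once the job finishes
        admitted.append(ok)
    return admitted


if __name__ == "__main__":
    cpu = CpuPool(free_cores=4)
    budget = PrivacyBudget(epsilon_global=1.0)
    print(run_jobs(cpu, budget, n_jobs=3, cores=4, epsilon=0.5))
```

In this sketch every job gets its CPU back, so compute never blocks a later job, whereas the third job is rejected because the first two have permanently exhausted the privacy budget. This is the property that motivates a scheduler designed specifically for the privacy resource.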