Privacy Amplification via Compression: Achieving the Optimal Privacy-Accuracy-Communication Trade-off in Distributed Mean Estimation - 专知论文


Federated learning · Mean · Optimal · Frequency estimation · Differential

April 4, 2023

Privacy Amplification via Compression: Achieving the Optimal Privacy-Accuracy-Communication Trade-off in Distributed Mean Estimation


Wei-Ning Chen, Dan Song, Ayfer Ozgur, Peter Kairouz

Privacy and communication constraints are two major bottlenecks in federated learning (FL) and federated analytics (FA). We study the optimal accuracy of mean estimation and frequency estimation (canonical models for FL and FA, respectively) under joint communication and $(\varepsilon, \delta)$-differential privacy (DP) constraints. We show that, to achieve the optimal error under $(\varepsilon, \delta)$-DP, it is sufficient for each client to send $\Theta\left( n \min\left(\varepsilon, \varepsilon^2\right)\right)$ bits for FL and $\Theta\left(\log\left( n\min\left(\varepsilon, \varepsilon^2\right) \right)\right)$ bits for FA to the server, where $n$ is the number of participating clients. Without compression, each client needs $O(d)$ bits for mean estimation and $\log d$ bits for frequency estimation, where $d$ is the number of trainable parameters in FL or the domain size in FA. Compression therefore yields significant savings in the regime $n \min\left(\varepsilon, \varepsilon^2\right) = o(d)$, which is often the relevant regime in practice. Our algorithms leverage compression for privacy amplification: when each client communicates only partial information about its sample, we show that privacy can be amplified by randomly selecting the part contributed by each client.

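The idea in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: it is a simplified illustration that assumes each client reveals `k` uniformly random coordinates of its vector and that a trusted server adds Gaussian noise to the aggregate. The function names `bits_budget` and `subsampled_mean`, the rescaling by `d / k`, and the noise scale `sigma` are all assumptions made for this example; in particular, `sigma` is a free parameter here and is not calibrated to any $(\varepsilon, \delta)$ guarantee.

```python
import numpy as np


def bits_budget(n: int, eps: float) -> float:
    """Order-of-magnitude per-client budget n * min(eps, eps^2).

    Constants hidden by the Theta(.) notation are dropped, so this is
    only a rough comparison against the uncompressed cost of order d.
    """
    return n * min(eps, eps ** 2)


def subsampled_mean(
    x: np.ndarray, k: int, sigma: float, rng: np.random.Generator
) -> np.ndarray:
    """Estimate the mean of the rows of x from k random coordinates per client.

    x has shape (n, d): one d-dimensional vector per client.
    Each client reveals k coordinates chosen uniformly without replacement.
    A coordinate is revealed with probability k / d, so rescaling by d / k
    keeps the estimate unbiased.
    """
    n, d = x.shape
    total = np.zeros(d)
    for i in range(n):
        idx = rng.choice(d, size=k, replace=False)  # the part this client sends
        total[idx] += x[i, idx] * (d / k)
    # Hypothetical noise step: sigma is NOT derived from (eps, delta) here.
    noise = rng.normal(0.0, sigma, size=d)
    return (total + noise) / n


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 1000, 50
    x = rng.uniform(-1.0, 1.0, size=(n, d))
    estimate = subsampled_mean(x, k=5, sigma=1.0, rng=rng)
    print("budget:", bits_budget(n, eps=0.5))
    print("max abs error:", np.abs(estimate - x.mean(axis=0)).max())
```

For example, with $n = 1000$ clients and $\varepsilon = 0.5$, the budget $n \min(\varepsilon, \varepsilon^2)$ evaluates to 250 (up to constants), which is far smaller than $d$ when a model has millions of trainable parameters.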
