Impact of Sampling on Locally Differentially Private Data Collection


Analysis · Estimation/Estimator · Sample · Unbiased · Learning

June 2, 2022


Sayan Biswas, Graham Cormode, Carsten Maple

With the recent proliferation of data, threats to individuals' private information have surged, and techniques for optimizing privacy-preserving data analysis have become a focus of research in recent years. In this paper, we analyse the impact of sampling on the utility of the standard techniques for frequency estimation, which is at the core of large-scale data analysis, when data are released under a locally differentially private pure protocol. We study a distributed data-sharing environment in which values are reported by various nodes to a central server, e.g., cross-device Federated Learning. We show that if the nodes are randomly sampled in order to reduce the communication cost, the standard existing estimators fail to remain unbiased. We propose a new unbiased estimator for the setting in which each node is sampled with a certain probability, and use it to compute various statistical summaries of the data. As a step towards further generalisation, we propose a way of sampling each node with a personalized sampling probability, which leads to some interesting open questions. We analyse the accuracy of our proposed estimators on synthetic datasets to gain insight into the trade-off between communication cost, privacy, and utility.
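The bias that sampling introduces can be illustrated with a minimal sketch. It assumes k-ary randomized response as the pure protocol and an inverse-probability (Horvitz-Thompson-style) reweighting as the correction. Both choices, and every function name below, are illustrative assumptions; the abstract does not specify the estimator the paper actually proposes, so this should not be read as the authors' method.

```python
import math
import random


def grr_probs(epsilon, k):
    """Return (p, q) for k-ary randomized response under epsilon-LDP (assumed protocol)."""
    e = math.exp(epsilon)
    return e / (e + k - 1), 1.0 / (e + k - 1)


def perturb(value, k, p, rng):
    """Report the true value with probability p, otherwise a uniformly chosen other value."""
    if rng.random() < p:
        return value
    other = rng.randrange(k - 1)
    return other if other < value else other + 1  # skip the true value


def collect(values, gammas, k, p, rng):
    """Node i reports with probability gammas[i]; returns (node index, report) pairs."""
    return [(i, perturb(v, k, p, rng))
            for i, v in enumerate(values) if rng.random() < gammas[i]]


def naive_estimate(reports, n, target, p, q):
    """Standard count estimator, applied as if all n nodes had reported."""
    count = sum(1 for _, r in reports if r == target)
    return (count - n * q) / (p - q)


def weighted_estimate(reports, gammas, n, target, p, q):
    """Hypothetical correction: each received report is weighted by 1 / gamma_i."""
    weighted = sum(1.0 / gammas[i] for i, r in reports if r == target)
    return (weighted - n * q) / (p - q)


if __name__ == "__main__":
    rng = random.Random(0)
    k, n = 4, 2000
    p, q = grr_probs(1.0, k)
    values = [0] * 800 + [1] * 600 + [2] * 400 + [3] * 200  # synthetic data
    gammas = [0.5] * n  # uniform sampling; per-node values also work
    reports = collect(values, gammas, k, p, rng)
    print("naive:   ", naive_estimate(reports, n, 0, p, q))
    print("weighted:", weighted_estimate(reports, gammas, n, 0, p, q))
```

Under these assumptions, the naive estimator has expectation γ·n_v − (1 − γ)·n·q/(p − q) for a uniform sampling probability γ, which differs from the true count n_v whenever γ < 1. The reweighted version has expectation n_v, and because it takes a per-node list `gammas`, the same code also covers personalized sampling probabilities.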

