Statistical Theory of Differentially Private Marginal-based Data Synthesis Algorithms - Zhuanzhi (专知) Papers


January 21, 2023


Ximing Li, Chendi Wang, Guang Cheng

Marginal-based methods achieve promising performance in the synthetic data competition hosted by the National Institute of Standards and Technology (NIST). To deal with high-dimensional data, the distribution of synthetic data is represented by a probabilistic graphical model (e.g., a Bayesian network), while the raw data distribution is approximated by a collection of low-dimensional marginals. Differential privacy (DP) is guaranteed by introducing random noise to each low-dimensional marginal distribution. Despite their promising performance in practice, the statistical properties of marginal-based methods are rarely studied in the literature. In this paper, we study DP data synthesis algorithms based on Bayesian networks (BN) from a statistical perspective. We establish a rigorous accuracy guarantee for BN-based algorithms, where the errors are measured by the total variation (TV) distance or the $L^2$ distance. Related to downstream machine learning tasks, an upper bound for the utility error of the DP synthetic data is also derived. To complete the picture, we establish a lower bound for TV accuracy that holds for every $\epsilon$-DP synthetic data generator.
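To make the noise-on-marginals step concrete, the following is a minimal sketch of releasing a single low-dimensional marginal with the Laplace mechanism and measuring its error in TV distance. It is not the algorithm analyzed in the paper: the function names `noisy_marginal` and `tv_distance`, the replace-one-record neighboring relation (which gives the count histogram an $L^1$ sensitivity of 2), and the clip-and-renormalize post-processing are all assumptions made for illustration.

```python
import numpy as np


def noisy_marginal(data, cols, domain_sizes, epsilon, rng):
    """Release one marginal under epsilon-DP (illustrative sketch, hypothetical helper).

    data:         (n, d) integer array; column j takes values in
                  {0, ..., domain_sizes[j] - 1}
    cols:         tuple of column indices that define the marginal
    domain_sizes: number of categories of each column
    """
    shape = tuple(domain_sizes[j] for j in cols)
    counts = np.zeros(shape)
    # Build the contingency table; np.add.at accumulates repeated indices.
    np.add.at(counts, tuple(data[:, j] for j in cols), 1.0)
    # Assumption: neighboring datasets differ by replacing one record, so two
    # cells change by 1 each and the L1 sensitivity of the counts is 2.
    noisy = counts + rng.laplace(loc=0.0, scale=2.0 / epsilon, size=shape)
    # Post-processing (does not affect the DP guarantee): clip and renormalize.
    noisy = np.clip(noisy, 0.0, None)
    total = noisy.sum()
    if total == 0.0:
        return np.full(shape, 1.0 / noisy.size)
    return noisy / total


def tv_distance(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * np.abs(np.asarray(p) - np.asarray(q)).sum()


rng = np.random.default_rng(0)
n, domain_sizes = 5000, (2, 3, 4)
data = np.column_stack([rng.integers(0, k, size=n) for k in domain_sizes])

cols = (0, 1)
true_counts = np.zeros((2, 3))
np.add.at(true_counts, (data[:, 0], data[:, 1]), 1.0)
true_marginal = true_counts / n

private_marginal = noisy_marginal(data, cols, domain_sizes, epsilon=1.0, rng=rng)
print("TV error of the noisy marginal:", tv_distance(true_marginal, private_marginal))
```

When several marginals are released, basic composition suggests splitting the privacy budget, for example giving each of $k$ marginals $\epsilon / k$, so that the whole collection remains $\epsilon$-DP.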

