Attribute Privacy: Framework and Mechanisms

Ensuring the privacy of training data is a growing concern, since many machine learning models are trained on confidential and potentially sensitive data. Much attention has been devoted to methods for protecting individual privacy during analyses of large datasets. However, in many settings, global properties of the dataset may also be sensitive (e.g., the mortality rate in a hospital rather than the presence of a particular patient in the dataset). In this work, we depart from individual privacy to initiate the study of attribute privacy, where a data owner is concerned about revealing sensitive properties of a whole dataset during analysis. We propose definitions to capture \emph{attribute privacy} in two relevant cases where global attributes may need to be protected: (1) properties of a specific dataset and (2) parameters of the underlying distribution from which the dataset is sampled. We also provide two efficient mechanisms and one inefficient mechanism that satisfy attribute privacy in these settings. We base our results on a novel use of the Pufferfish framework to account for correlations across attributes in the data, thus addressing "the challenging problem of developing Pufferfish instantiations and algorithms for general aggregate secrets" that was left open by \cite{kifer2014pufferfish}.

