Local differential privacy (LDP) has recently attracted much attention as a privacy metric that prevents the inference of personal data from obfuscated data in the local model. However, there are scenarios in which the adversary instead wants to perform re-identification attacks, linking the obfuscated data to users in this model. Because LDP is not designed to directly prevent re-identification, it can cause excessive obfuscation and destroy utility in these scenarios. In this paper, we propose a measure of re-identification risks, which we call PIE (Personal Information Entropy). The PIE is designed to directly prevent re-identification attacks in the local model: it lower-bounds the lowest possible re-identification error probability (i.e., the Bayes error probability) of the adversary. We analyze the relation between LDP and the PIE, and we analyze the PIE and the utility in distribution estimation for two obfuscation mechanisms providing LDP. Through experiments, we show that when re-identification is considered as the privacy risk, LDP can cause excessive obfuscation and destroy utility. We then show that the PIE can be used to guarantee low re-identification risks for local obfuscation mechanisms while maintaining high utility.