Local differential privacy (LDP) has recently attracted much attention as a privacy metric that prevents the inference of personal data from obfuscated data in the local model. However, there are scenarios in which the adversary must perform a re-identification attack to link the obfuscated data to users in this model. In these scenarios, LDP can cause excessive obfuscation and destroy utility, because it is not designed to directly prevent re-identification. In this paper, we propose a privacy metric that we call the PIE (Personal Information Entropy). The PIE is designed to directly prevent re-identification attacks in the local model: it lower-bounds the lowest possible re-identification error probability (i.e., the Bayes error probability) of the adversary. We analyze the relation between LDP and the PIE, and analyze the PIE and utility in distribution estimation for two obfuscation mechanisms providing LDP. Through experiments, we show that LDP fails to guarantee meaningful privacy and utility in distribution estimation. We then show that the PIE can be used to guarantee low re-identification risks for the local obfuscation mechanisms while keeping high utility.
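To make "an obfuscation mechanism providing LDP" and "distribution estimation" concrete, the following is a minimal sketch of k-ary randomized response, a standard ε-LDP mechanism, together with the usual unbiased estimator of the input distribution. The abstract does not name the two mechanisms it analyzes, so the choice of this mechanism is an assumption made for illustration. The function names (`krr_probs`, `krr_obfuscate`, `estimate_distribution`) are hypothetical, and the sketch does not compute the PIE.

```python
import math
import random
from collections import Counter


def krr_probs(epsilon, k):
    """Return (p_keep, p_other) for k-ary randomized response.

    p_keep is the probability of reporting the true value; p_other is the
    probability of reporting any one specific other value. Their ratio is
    exp(epsilon), which is what gives epsilon-LDP.
    """
    denom = math.exp(epsilon) + k - 1
    return math.exp(epsilon) / denom, 1.0 / denom


def krr_obfuscate(value, epsilon, k, rng):
    """Obfuscate one value in {0, ..., k-1} on the user's side."""
    p_keep, _ = krr_probs(epsilon, k)
    if rng.random() < p_keep:
        return value
    # Otherwise pick uniformly among the k - 1 values different from `value`.
    other = rng.randrange(k - 1)
    return other if other < value else other + 1


def estimate_distribution(reports, epsilon, k):
    """Unbiased estimate of the input distribution from obfuscated reports.

    Requires epsilon > 0 (for epsilon = 0 the reports carry no information).
    Individual estimates may fall outside [0, 1] for small samples.
    """
    p_keep, p_other = krr_probs(epsilon, k)
    n = len(reports)
    counts = Counter(reports)
    return [(counts.get(x, 0) / n - p_other) / (p_keep - p_other)
            for x in range(k)]


if __name__ == "__main__":
    rng = random.Random(0)
    k, epsilon = 4, 1.0
    # Hypothetical personal data: a skewed distribution over 4 categories.
    data = [0] * 5000 + [1] * 3000 + [2] * 1500 + [3] * 500
    reports = [krr_obfuscate(v, epsilon, k, rng) for v in data]
    print(estimate_distribution(reports, epsilon, k))
```

A smaller `epsilon` pushes `p_keep` toward `1/k`, so each report reveals less about the input and the estimated distribution becomes noisier. This is the privacy/utility trade-off that the abstract refers to.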