Local differential privacy (LDP) has recently attracted much attention as a privacy metric that prevents the inference of personal data from obfuscated data in the local model. In some scenarios, however, the adversary aims to mount re-identification attacks, which link obfuscated data to users in this model. Because LDP is not designed to directly prevent re-identification, it can cause excessive obfuscation and destroy utility in these scenarios. In this paper, we propose a measure of re-identification risk, which we call PIE (Personal Information Entropy). PIE is designed to directly prevent re-identification attacks in the local model: it lower-bounds the lowest possible re-identification error probability (i.e., the Bayes error probability) of the adversary. We analyze the relation between LDP and PIE, and we analyze PIE and utility in distribution estimation for two obfuscation mechanisms that provide LDP. Through experiments, we show that when re-identification is regarded as the privacy risk, LDP can cause excessive obfuscation and destroy utility. We then show that PIE can be used to guarantee low re-identification risk for local obfuscation mechanisms while maintaining high utility.
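To make "an obfuscation mechanism providing LDP" and "distribution estimation" concrete, the following is a minimal sketch under stated assumptions. It uses k-ary randomized response, a standard ε-LDP mechanism chosen here purely for illustration; the abstract does not name the two mechanisms it analyzes, and this sketch does not implement PIE. The function names (`keep_probability`, `k_rr`, `estimate_distribution`) are hypothetical.

```python
import math
import random
from collections import Counter


def keep_probability(k, epsilon):
    """Probability that k-ary randomized response reports the true value."""
    return math.exp(epsilon) / (math.exp(epsilon) + k - 1)


def k_rr(value, k, epsilon, rng):
    """Obfuscate `value` in {0, ..., k-1} with k-ary randomized response.

    The true value is kept with probability e^eps / (e^eps + k - 1);
    otherwise one of the other k - 1 values is reported uniformly at random.
    The ratio between any two output probabilities is at most e^eps,
    which is the epsilon-LDP condition.
    """
    if rng.random() < keep_probability(k, epsilon):
        return value
    other = rng.randrange(k - 1)  # index among the k - 1 other values
    return other if other < value else other + 1


def estimate_distribution(reports, k, epsilon):
    """Unbiased estimate of the true distribution from obfuscated reports.

    The expected observed frequency of v is q + (p - q) * pi_v,
    so we invert that affine map. Individual estimates may fall
    slightly outside [0, 1] for small samples.
    """
    p = keep_probability(k, epsilon)
    q = (1 - p) / (k - 1)  # probability of reporting one specific other value
    n = len(reports)
    counts = Counter(reports)
    return [(counts[v] / n - q) / (p - q) for v in range(k)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, for a reproducible illustration only
    true_dist = [0.5, 0.3, 0.2]  # made-up distribution, not data from the paper
    k, epsilon, n = len(true_dist), 1.0, 20000
    data = rng.choices(range(k), weights=true_dist, k=n)
    reports = [k_rr(x, k, epsilon, rng) for x in data]
    print(estimate_distribution(reports, k, epsilon))
```

In this sketch a smaller ε flattens the reports toward uniform noise, which inflates the variance of the estimate. That is the kind of utility loss the abstract refers to when it says LDP can cause excessive obfuscation.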