In this paper, we study local information privacy (LIP) and design LIP-based mechanisms for statistical aggregation that protect users' privacy without relying on a trusted third party. LIP incorporates context-awareness, which can be viewed as explicitly modeling the adversary's background knowledge. This enables the design of privacy-preserving mechanisms that leverage the prior distribution and can potentially achieve a better utility-privacy tradeoff than context-free notions such as Local Differential Privacy (LDP). We present an optimization framework that minimizes the mean squared error of the data aggregation, subject to LIP constraints that protect each individual user's input data or a correlated latent variable. We then study two types of applications, (weighted) summation and histogram estimation, and derive the optimal context-aware data perturbation parameters for each case, based on randomized-response-type mechanisms. We further compare the utility-privacy tradeoffs of LIP and LDP and explain theoretically why incorporating prior knowledge enlarges the feasible region of the perturbation parameters, which in turn leads to higher utility. We also extend the LIP-based privacy mechanisms to the more general case in which exact prior knowledge is not available. Finally, we validate our analysis through simulations on both synthetic and real-world data. The results show that our LIP-based privacy mechanism provides a better utility-privacy tradeoff than LDP, and that the advantage of LIP becomes more significant as the prior distribution becomes more skewed.
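The claim that prior knowledge enlarges the feasible region can be illustrated with a small sketch for a single binary input. The abstract does not state the formal definitions, so the code below assumes the commonly used ones: ε-LDP requires P(Y=y|X=x) ≤ e^ε · P(Y=y|X=x') for all x, x', y, and ε-LIP requires e^(-ε) ≤ P(Y=y|X=x) / P(Y=y) ≤ e^ε for all x, y. The symmetric randomized response channel, the function names, the prior value, and the grid search are illustrative assumptions, not the paper's optimal mechanism.

```python
import math


def channel(q):
    """Symmetric binary randomized response (assumed for illustration).

    Returns P(Y=y | X=x) as a dict keyed by (x, y): the true bit is
    reported with probability q and flipped with probability 1 - q.
    """
    return {(x, y): (q if x == y else 1.0 - q) for x in (0, 1) for y in (0, 1)}


def satisfies_ldp(q, eps):
    """Check the assumed eps-LDP condition; it does not use the prior."""
    ch = channel(q)
    bound = math.exp(eps)
    return all(
        ch[(x, y)] <= bound * ch[(x_other, y)]
        for x in (0, 1) for x_other in (0, 1) for y in (0, 1)
    )


def satisfies_lip(q, prior_one, eps):
    """Check the assumed eps-LIP condition for a prior P(X=1) = prior_one."""
    ch = channel(q)
    prior = {0: 1.0 - prior_one, 1: prior_one}
    low, high = math.exp(-eps), math.exp(eps)
    for y in (0, 1):
        # Marginal output probability P(Y=y) under the prior.
        p_y = sum(prior[x] * ch[(x, y)] for x in (0, 1))
        for x in (0, 1):
            # Written multiplicatively to avoid dividing by P(Y=y).
            if not (low * p_y <= ch[(x, y)] <= high * p_y):
                return False
    return True


def max_feasible_q(check, steps=1000):
    """Largest q on a grid over [0.5, 1] that passes `check`.

    q = 0.5 (output independent of input) always passes, so the
    feasible set is never empty.
    """
    return max(k / steps for k in range(steps // 2, steps + 1) if check(k / steps))


if __name__ == "__main__":
    eps, prior_one = 0.5, 0.9  # hypothetical privacy budget and prior
    q_ldp = max_feasible_q(lambda q: satisfies_ldp(q, eps))
    q_lip = max_feasible_q(lambda q: satisfies_lip(q, prior_one, eps))
    print("largest truthful-report probability under LDP:", q_ldp)
    print("largest truthful-report probability under LIP:", q_lip)
```

A larger feasible q means the reported bit matches the true bit more often, so an aggregator can estimate sums or histograms with less noise. Under the assumed definitions, any channel that satisfies ε-LDP also satisfies ε-LIP for every prior, so the LIP value is never smaller than the LDP value.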