We investigate the fundamental limits and optimal mechanisms of privacy-preserving data release that minimize privacy leakage under utility constraints for non-specific tasks. While the private feature is typically determined and known by the users who release their data, the specific task for which the released data will be used is usually unknown. To address this lack of information about the task, we impose utility constraints on a set of possible tasks, so that the mechanism protects privacy while satisfying the utility requirement of every task in the set. First, we derive a single-letter characterization of the rate-leakage-distortion region. Characterizing the minimum leakage under log-loss distortion and an unconstrained released rate turns out to be a non-convex problem. Second, focusing on the case where the raw data consists of independent components, we show that this problem can be decomposed into multiple parallel privacy funnel (PF) problems with different weightings. We explicitly derive the solution to each PF problem when the private feature is a deterministic function of a data component. The solution is characterized by a leakage-free threshold: the minimum leakage is zero as long as the utility constraint is below the threshold. Finally, we show that the optimal weighting of each PF problem can be found by solving a linear program (LP). A released rate sufficient to achieve the minimum leakage is also derived. Numerical results illustrate the robustness of our approach to task non-specificity. Our results also extend to the differential privacy metric: we show that the problem with multiple constraints can again be decomposed into multiple single-constraint problems. However, an analytical solution to each single-constraint problem remains open, and these problems can currently be solved only by numerical methods.
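To make the privacy funnel quantities concrete, the following minimal sketch evaluates the leakage and the utility of a given release mechanism on a toy example. It assumes the common PF formulation in which leakage is measured by I(S;Y) and utility under log-loss by I(X;Y); the function names, the toy distribution, and the two mechanisms are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def mutual_information(joint):
    """Mutual information (in bits) of a joint pmf given as a 2-D array."""
    joint = np.asarray(joint, dtype=float)
    p_a = joint.sum(axis=1, keepdims=True)   # marginal of the row variable
    p_b = joint.sum(axis=0, keepdims=True)   # marginal of the column variable
    prod = p_a * p_b                         # product of marginals, same shape as joint
    mask = joint > 0                         # skip zero-probability cells (0 log 0 = 0)
    return float(np.sum(joint[mask] * np.log2(joint[mask] / prod[mask])))


def leakage_and_utility(p_x, s_of_x, mechanism):
    """Return (I(S;Y), I(X;Y)) for S = s_of_x[X] and mechanism[x, y] = P(Y=y | X=x).

    Assumption: S is a deterministic function of X, encoded as a list of labels.
    """
    p_x = np.asarray(p_x, dtype=float)
    mechanism = np.asarray(mechanism, dtype=float)
    joint_xy = p_x[:, None] * mechanism      # P(X=x, Y=y)
    n_s = max(s_of_x) + 1
    joint_sy = np.zeros((n_s, mechanism.shape[1]))
    for x, s in enumerate(s_of_x):           # merge rows of X that map to the same S
        joint_sy[s] += joint_xy[x]
    return mutual_information(joint_sy), mutual_information(joint_xy)


# Toy example (hypothetical): X uniform on {0, 1, 2, 3}, private feature S = X mod 2.
p_x = [0.25, 0.25, 0.25, 0.25]
s_of_x = [0, 1, 0, 1]

identity = np.eye(4)                                  # release X itself
high_bit = np.array([[1, 0], [1, 0], [0, 1], [0, 1]]) # release only X // 2

leak_id, util_id = leakage_and_utility(p_x, s_of_x, identity)
leak_hb, util_hb = leakage_and_utility(p_x, s_of_x, high_bit)
print(f"identity: leakage={leak_id:.3f} bits, utility={util_id:.3f} bits")
print(f"high bit: leakage={leak_hb:.3f} bits, utility={util_hb:.3f} bits")
```

In this toy case, releasing X itself gives 2 bits of utility at the cost of 1 bit of leakage, whereas releasing only the high bit gives 1 bit of utility with zero leakage. This is the kind of behavior a leakage-free threshold describes: some positive utility level can be met without leaking anything about the private feature.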