We consider data release protocols for data $X=(S,U)$, where $S$ is sensitive; the released data $Y$ should contain as much information about $X$ as possible, measured by $\operatorname{I}(X;Y)$, without leaking too much about $S$. We introduce the Robust Local Differential Privacy (RLDP) framework to measure privacy. This framework relies on the underlying distribution of the data, which needs to be estimated from available data. A robust privacy guarantee ensures privacy for all distributions in a given set $\mathcal{F}$, for which we study two cases: when $\mathcal{F}$ is the set of all distributions, and when $\mathcal{F}$ is a confidence set arising from a $\chi^2$ test on a publicly available dataset. In the former case we introduce a new release protocol that we prove to be optimal in the low privacy regime. In the latter case we present four algorithms that construct RLDP protocols from a given dataset. One of these approximates $\mathcal{F}$ by a polytope and uses results from robust optimisation to yield high-utility release protocols. However, this algorithm relies on vertex enumeration and becomes computationally intractable for large input spaces. The other three algorithms are low-complexity and build on randomised response. Experiments verify that all four algorithms offer significantly improved utility over regular LDP.
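For concreteness, the quantities above can be written out in the standard information-theoretic form; the display below is a sketch of those conventions (the exact RLDP constraint is defined in the body of the paper, so the privacy inequality here should be read as the usual local-differential-privacy condition applied to the sensitive attribute $S$, not as the paper's verbatim definition). With release protocol $Q(y\mid x)$ and data distribution $P$, the utility is
\[
\operatorname{I}(X;Y) \;=\; \sum_{x,y} P(x)\,Q(y\mid x)\,\log\frac{Q(y\mid x)}{\sum_{x'} P(x')\,Q(y\mid x')},
\]
while a local-differential-privacy style guarantee for $S$ takes the form
\[
P(Y=y \mid S=s) \;\le\; e^{\varepsilon}\, P(Y=y \mid S=s') \qquad \text{for all } s, s', y .
\]
Since $P(Y=y\mid S=s)$ depends on the unknown distribution $P$ of $X=(S,U)$, the guarantee is only meaningful if it holds for every $P$ in the uncertainty set $\mathcal{F}$, which is what motivates the robust formulation.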