We analyze the extent to which end users can infer information about the level of protection of their data when the data obfuscation mechanism is a priori unknown to them (the so-called "black-box" scenario). In particular, we investigate several notions of differential privacy (DP), namely epsilon-DP, local DP, and R\'enyi DP. On the one hand, we prove that, without any assumption on the underlying distributions, no algorithm can infer the level of data protection with provable guarantees. On the other hand, we show that, under reasonable assumptions (namely, Lipschitzness of the involved densities on a closed interval), such guarantees exist and can be achieved by a simple histogram-based estimator. Then, using one of the best-known DP obfuscation mechanisms (the Laplace mechanism), we verify empirically that the number of samples required by our theoretical bound is actually much larger than the number needed in practice to obtain satisfactory results. Furthermore, we also observe that the estimated epsilon is in practice much closer to the real one than our theorems foresee.
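To illustrate the idea in the black-box setting, the following is a minimal sketch of a histogram-based epsilon estimator for the Laplace mechanism. It is not the paper's exact construction: the bin count, the histogram range, and the sample size are illustrative choices, and the estimator simply takes the maximum empirical log-ratio of the two output densities over bins where both are positive.

```python
import numpy as np

rng = np.random.default_rng(0)

def laplace_mech(value, sensitivity, eps, n):
    # n noisy outputs of the Laplace mechanism applied to a single query value
    return value + rng.laplace(0.0, sensitivity / eps, size=n)

def estimate_eps(samples_a, samples_b, bins=50, lo=-4.0, hi=5.0):
    # Histogram-based estimator: eps_hat = max over bins of |log(p_a / p_b)|,
    # restricted to bins where both empirical densities are positive.
    pa, _ = np.histogram(samples_a, bins=bins, range=(lo, hi), density=True)
    pb, _ = np.histogram(samples_b, bins=bins, range=(lo, hi), density=True)
    mask = (pa > 0) & (pb > 0)
    return float(np.max(np.abs(np.log(pa[mask] / pb[mask]))))

# Outputs of the mechanism on two adjacent inputs (0 and 1), sensitivity 1,
# with true epsilon = 1.0; the estimate should land close to that value.
n = 200_000
a = laplace_mech(0.0, 1.0, 1.0, n)
b = laplace_mech(1.0, 1.0, 1.0, n)
print(estimate_eps(a, b))
```

Far more samples per bin than a worst-case bound would demand already give a usable estimate here, which mirrors the paper's empirical observation that the theoretical sample complexity is pessimistic.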