On the Statistical Complexity of Estimation and Testing under Privacy Constraints

Producing statistics that respect the privacy of the samples while still maintaining their accuracy is an important topic of research. We study minimax lower bounds when the class of estimators is restricted to the differentially private ones. In particular, we show that characterizing the power of a distributional test under differential privacy can be done by solving a transport problem. With specific coupling constructions, this observation allows us to derive Le Cam-type and Fano-type inequalities both for the regular definitions of differential privacy and for divergence-based ones (based on Rényi divergence). We then illustrate our results on three simple, fully worked-out examples. In particular, we show that the problem class strongly influences the provable degradation of utility due to privacy. For some problems, privacy leads to a provable degradation only when the rate of the privacy parameters is small enough, whereas for other problems, the degradation systematically occurs under much looser hypotheses on the privacy parameters. Finally, we show that the known privacy guarantees of DP-SGLD, a private convex solver, when used to perform maximum likelihood, lead to an algorithm that is near-minimax optimal in both the sample size and the privacy tuning parameters of the problem for a broad class of parametric estimation procedures that includes exponential families.
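
For readers unfamiliar with the privacy notions named in the abstract, the standard textbook definitions can be sketched as follows. The notation is an assumption made here for illustration and is not taken from the paper: $M$ is a randomized mechanism, $X \sim X'$ denotes two datasets differing in a single sample, $\mathcal{M}_{\mathrm{priv}}$ is the class of estimators satisfying the chosen privacy constraint, and $\ell$ is a generic loss.

```latex
% Assumed notation (illustrative, not the paper's): M is a randomized
% mechanism; X ~ X' are datasets that differ in exactly one sample.

% (epsilon, delta)-differential privacy:
\[
  \Pr[M(X) \in S] \le e^{\varepsilon}\,\Pr[M(X') \in S] + \delta
  \quad \text{for all measurable } S \text{ and all } X \sim X'.
\]

% Renyi differential privacy of order alpha > 1 (divergence-based):
\[
  D_{\alpha}\bigl(M(X)\,\|\,M(X')\bigr) \le \varepsilon
  \quad \text{for all } X \sim X',
\]
\[
  D_{\alpha}(P\,\|\,Q) = \frac{1}{\alpha-1}
  \log \mathbb{E}_{Q}\!\left[\left(\frac{dP}{dQ}\right)^{\alpha}\right].
\]

% Minimax risk restricted to private estimators:
\[
  \inf_{\hat\theta \in \mathcal{M}_{\mathrm{priv}}}\;
  \sup_{\theta \in \Theta}\;
  \mathbb{E}_{\theta}\bigl[\ell\bigl(\hat\theta(X_1,\dots,X_n),\theta\bigr)\bigr].
\]
```

A lower bound on the last quantity is what a Le Cam-type or Fano-type inequality would provide; the specific bounds obtained in the paper are not reproduced here.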
