Differential privacy and robust statistics in high dimensions

We introduce a universal framework for characterizing the statistical efficiency of a statistical estimation problem with differential privacy guarantees. Our framework, which we call High-dimensional Propose-Test-Release (HPTR), builds upon three crucial components: the exponential mechanism, robust statistics, and the Propose-Test-Release mechanism. Gluing all these together is the concept of resilience, which is central to robust statistical estimation. Resilience guides the design of the algorithm, the sensitivity analysis, and the success probability analysis of the test step in Propose-Test-Release. The key insight is that if we design an exponential mechanism that accesses the data only via one-dimensional robust statistics, then the resulting local sensitivity can be dramatically reduced. Using resilience, we can provide tight local sensitivity bounds. These tight bounds readily translate into near-optimal utility guarantees in several cases. We give a general recipe for applying HPTR to a given instance of a statistical estimation problem and demonstrate it on canonical problems of mean estimation, linear regression, covariance estimation, and principal component analysis. We introduce a general utility analysis technique that proves that HPTR nearly achieves the optimal sample complexity under several scenarios studied in the literature.
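The key insight above, an exponential mechanism that touches the data only through one-dimensional robust statistics, can be illustrated with a toy sketch for mean estimation. This is not the HPTR algorithm itself. It omits the test step of Propose-Test-Release, uses a finite candidate set and a finite set of directions, and takes the score sensitivity as a caller-supplied assumption, so it carries no privacy guarantee on its own. The function names, the choice of trimmed mean as the robust statistic, and every parameter value are hypothetical choices made for illustration.

```python
import math
import random


def dot(u, w):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, w))


def trimmed_mean(values, trim_frac):
    """Mean after dropping the trim_frac smallest and largest fractions.

    Assumes 0 <= trim_frac < 0.5 so that at least one value is kept.
    """
    ordered = sorted(values)
    k = int(trim_frac * len(ordered))
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def directional_score(candidate, data, directions, trim_frac):
    """Largest gap, over the given directions, between the projected
    candidate and the trimmed mean of the projected data.

    The data enter only through one-dimensional robust statistics.
    """
    worst = 0.0
    for v in directions:
        robust_center = trimmed_mean([dot(v, x) for x in data], trim_frac)
        worst = max(worst, abs(dot(v, candidate) - robust_center))
    return worst


def exponential_mechanism(candidates, scores, epsilon, sensitivity, rng):
    """Sample a candidate with probability proportional to
    exp(-epsilon * score / (2 * sensitivity)); lower scores are better.

    `sensitivity` is an assumed bound supplied by the caller, not derived here.
    """
    best = min(scores)  # shift scores so every exponent is <= 0
    weights = [math.exp(-epsilon * (s - best) / (2.0 * sensitivity)) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    # 2-D Gaussian samples around (1, -2), with a few gross outliers mixed in.
    data = [(rng.gauss(1.0, 1.0), rng.gauss(-2.0, 1.0)) for _ in range(500)]
    data += [(1e6, 1e6)] * 10
    # Eight evenly spaced unit directions stand in for "all directions".
    directions = [(math.cos(t), math.sin(t)) for t in (i * math.pi / 8 for i in range(8))]
    # Coarse grid of candidate means.
    candidates = [(a / 2.0, b / 2.0) for a in range(-8, 9) for b in range(-8, 9)]
    scores = [directional_score(c, data, directions, trim_frac=0.1) for c in candidates]
    estimate = exponential_mechanism(candidates, scores, epsilon=1.0, sensitivity=0.01, rng=rng)
    print(estimate)
```

Because each projection is summarized by a trimmed mean, the ten outliers are discarded in every direction and the score stays small near the bulk of the data. That stability under corruption of a few points is the property the abstract connects to resilience and to a reduced local sensitivity.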
