Confidence intervals are a fundamental tool for quantifying the uncertainty of parameters of interest. As awareness of data privacy grows, developing private versions of confidence intervals has attracted increasing attention from both statisticians and computer scientists. Differential privacy is a state-of-the-art framework for analyzing privacy loss when releasing statistics computed from sensitive data. Recent work has addressed differentially private confidence intervals, yet to the best of our knowledge, rigorous methodology for constructing them in the context of survey sampling has not been studied. In this paper, we propose three differentially private algorithms for constructing confidence intervals for proportions under stratified random sampling. We articulate two variants of differential privacy that make sense for data from stratified sampling designs, and we analyze each of our algorithms within one of these two variants. We establish analytical privacy guarantees and asymptotic properties of the estimators. In addition, we conduct simulation studies to evaluate the proposed private confidence intervals, and we provide two applications to the 1940 Census data.
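To make the setting concrete, the following is a minimal illustrative sketch, not one of the three algorithms proposed in the paper. It assumes a textbook Laplace mechanism applied to each stratum's count, a neighbouring relation that replaces one sampled record within its stratum (so each count has sensitivity 1), publicly known sample sizes and population sizes, and a plug-in normal approximation whose variance adds the Laplace noise variance to the usual stratified sampling variance. The function and parameter names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def clamp01(value):
    # Restrict a value to the unit interval.
    return min(1.0, max(0.0, value))


def dp_stratified_proportion_ci(counts, sample_sizes, pop_sizes, epsilon,
                                level=0.95, rng=None):
    """Illustrative epsilon-DP interval for a stratified proportion.

    Assumption: one record changes within a single stratum, so each
    stratum count has sensitivity 1 and the strata compose in parallel.
    """
    rng = rng or random.Random()
    total = sum(pop_sizes)
    scale = 1.0 / epsilon  # Laplace scale for a sensitivity-1 count
    estimate = 0.0
    variance = 0.0
    for x, n, N in zip(counts, sample_sizes, pop_sizes):
        weight = N / total
        p_noisy = (x + laplace_noise(scale, rng)) / n
        # Clamping is post-processing, so it costs no extra privacy.
        p_plug = clamp01(p_noisy)
        estimate += weight * p_noisy
        # Sampling variance with finite population correction.
        sampling_var = (1 - n / N) * p_plug * (1 - p_plug) / (n - 1)
        # Variance of Laplace(0, scale) noise divided by n.
        noise_var = 2 * scale ** 2 / n ** 2
        variance += weight ** 2 * (sampling_var + noise_var)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * math.sqrt(variance)
    return estimate, clamp01(estimate - half_width), clamp01(estimate + half_width)


# Hypothetical two-stratum example; the numbers are made up for illustration.
est, lower, upper = dp_stratified_proportion_ci(
    counts=[30, 50], sample_sizes=[100, 100], pop_sizes=[1000, 3000],
    epsilon=1.0, rng=random.Random(0))
```

The noise term in the variance is what distinguishes this from a non-private stratified interval: ignoring it would give intervals that are too narrow at small values of epsilon. How to account for the noise rigorously, and under which definition of neighbouring datasets, is the kind of question the paper addresses.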