We propose Bayesian methods to assess the statistical disclosure risk of data released under zero-concentrated differential privacy, focusing on settings with a strong hierarchical structure and categorical variables with many levels. Risk assessment is performed by hypothesizing Bayesian intruders with various amounts of prior information and examining the distance between their posteriors and priors. We discuss applications of these risk assessment methods to differentially private data releases from the 2020 decennial census and perform simulation studies using public individual-level data from the 1940 decennial census. Among these studies, we examine how the data holder's choice of privacy parameter affects the disclosure risk and quantify the increase in risk when a hypothetical intruder incorporates substantial amounts of hierarchical information.
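To make the prior-versus-posterior comparison concrete, the following is a minimal sketch and not the procedure used in the paper. It rests on several simplifying assumptions that do not come from the abstract: a single categorical variable released as a histogram, continuous Gaussian noise calibrated to zero-concentrated differential privacy via rho = sensitivity^2 / (2 sigma^2) with L2 sensitivity sqrt(2), an intruder who knows every record except the target's, and total variation as the distance. All function names are hypothetical.

```python
import numpy as np


def sigma_from_rho(rho, l2_sensitivity=np.sqrt(2.0)):
    """Gaussian noise scale giving rho-zCDP: rho = sensitivity^2 / (2 sigma^2)."""
    return l2_sensitivity / np.sqrt(2.0 * rho)


def intruder_posterior(prior, noisy_counts, known_counts, sigma):
    """Posterior over the target's category for an intruder who knows all other records.

    Assumes a strictly positive prior. Under independent Gaussian noise, the
    log-likelihood of category k equals residual[k] / sigma^2 up to a constant
    shared by all categories.
    """
    prior = np.asarray(prior, dtype=float)
    residual = np.asarray(noisy_counts, dtype=float) - np.asarray(known_counts, dtype=float)
    log_post = np.log(prior) + residual / sigma**2
    log_post -= log_post.max()  # stabilize before exponentiating
    post = np.exp(log_post)
    return post / post.sum()


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * np.abs(np.asarray(p, dtype=float) - np.asarray(q, dtype=float)).sum()


def mean_prior_posterior_distance(rho, prior, known_counts, true_level, n_sims=500, seed=0):
    """Average prior-to-posterior distance over repeated noisy releases."""
    rng = np.random.default_rng(seed)
    sigma = sigma_from_rho(rho)
    true_counts = np.asarray(known_counts, dtype=float).copy()
    true_counts[true_level] += 1.0  # add the target's record to its true cell
    distances = []
    for _ in range(n_sims):
        noisy = true_counts + rng.normal(0.0, sigma, size=true_counts.shape)
        post = intruder_posterior(prior, noisy, known_counts, sigma)
        distances.append(total_variation(post, prior))
    return float(np.mean(distances))


if __name__ == "__main__":
    prior = np.full(5, 0.2)               # hypothetical uniform prior over 5 levels
    known = np.array([12, 7, 30, 4, 9])   # hypothetical counts of the other records
    for rho in (0.01, 0.1, 1.0, 10.0):
        print(rho, mean_prior_posterior_distance(rho, prior, known, true_level=3))
```

Running the script prints the average distance for each value of rho. In this toy setting a larger rho means less noise, so the intruder's posterior moves further from the prior. A more informed or less informed intruder could be modeled by changing `prior` or by treating some entries of `known_counts` as uncertain.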