In "Differential Perspectives: Epistemic Disconnects Surrounding the US Census Bureau's Use of Differential Privacy," boyd and Sarathy argue that empirical evaluations of the Census Disclosure Avoidance System (DAS), including our published analysis, failed to recognize that the benchmark data against which the 2020 DAS was evaluated is never a ground truth of population counts. In this commentary, we explain why policy evaluation, which was the main goal of our analysis, is still meaningful without access to a perfect ground truth. We also point out that our evaluation leveraged features specific to the decennial Census and redistricting data, such as block-level population invariance under swapping and voter file racial identification, and thereby better approximated a comparison with the ground truth. Lastly, we show that accurate statistical predictions of individual race based on Bayesian Improved Surname Geocoding, while not a violation of differential privacy, substantially increase the disclosure risk of private information the Census Bureau sought to protect. We conclude by arguing that policy makers must confront a key trade-off between data utility and privacy protection, and that an epistemic disconnect alone is insufficient to explain disagreements over policy choices.