Small area population estimates are necessary for many epidemiological studies, yet their quality and accuracy are often not assessed. In the United States, small area population counts are published by the United States Census Bureau (USCB) in three forms: decennial census counts, intercensal population projections from the Population Estimates Program (PEP), and American Community Survey (ACS) estimates. Although these data sources are closely related, they differ in important ways in how data are collected and processed, so each set of estimates may be subject to different sources and magnitudes of error. In addition, the sources do not report identical small area population counts, because each applies its own post-survey adjustments. As a result, small area disease and mortality rates may differ depending on which source supplies the population counts (the denominator data). To accurately capture annual small area population counts and their associated uncertainties, we present a Bayesian population model (B-Pop) that fuses information from all three USCB sources while accounting for source-specific methodologies and errors. The main features of our framework are: 1) a single model that integrates multiple data sources, 2) explicit modeling of each source's data-generating mechanism, and in particular of its source-specific errors, and 3) prediction of population estimates for years without USCB-reported data. We focus our study on the 159 counties of Georgia and produce estimates for the years 2005-2021.