Spatial aggregation with respect to a population distribution involves estimating aggregate quantities for a population based on observations of individuals in a subpopulation. In this context, a geostatistical workflow must account for three major sources of "aggregation error": aggregation weights, fine scale variation, and finite population variation. However, common practice is to treat the unknown population distribution as a known population density and to ignore empirical variability in outcomes. We improve on common practice by introducing a "sampling frame model" that allows aggregation models to account for the three sources of aggregation error simply and transparently. We compare the proposed and traditional approaches using two simulation studies that mimic neonatal mortality rate (NMR) data from the 2014 Kenya Demographic and Health Survey (KDHS2014). For the traditional approach, undercoverage/overcoverage depends arbitrarily on the aggregation grid resolution, while the new approach exhibits low sensitivity. The differences between the two aggregation approaches increase as the population of an area decreases. The differences are substantial at the second administrative level and finer, but also at the first administrative level for some population quantities. We find that the differences between the proposed and traditional approaches are consistent with those we observe in an application to NMR data from the KDHS2014.
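To make the three sources of aggregation error concrete, the following is a minimal toy sketch, not the sampling frame model itself. It contrasts a density-weighted average of a risk surface with an aggregate that also draws cell populations, cell-level fine scale effects, and individual outcomes. All names and settings (`aggregate_traditional`, `aggregate_with_frame`, `sigma_fine`, the multinomial and binomial draws, the 50-cell grid, the 3% risk) are illustrative assumptions.

```python
import numpy as np


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-x))


def aggregate_traditional(density, risk):
    """Density-weighted mean of the risk surface, treating density as known."""
    weights = density / density.sum()
    return float(np.sum(weights * risk))


def aggregate_with_frame(density, risk, total_pop, sigma_fine, n_draws, rng):
    """Toy draws of an area-level empirical prevalence (assumed model).

    Each draw adds three sources of variation:
      1. aggregation weights: cell populations are multinomial draws,
      2. fine scale variation: independent logit-scale noise per cell,
      3. finite population variation: binomial outcomes per cell.
    """
    weights = density / density.sum()
    logit_risk = np.log(risk / (1.0 - risk))
    draws = np.empty(n_draws)
    for d in range(n_draws):
        counts = rng.multinomial(total_pop, weights)
        eps = rng.normal(0.0, sigma_fine, size=risk.shape)
        events = rng.binomial(counts, expit(logit_risk + eps))
        draws[d] = events.sum() / total_pop
    return draws


rng = np.random.default_rng(0)
density = rng.gamma(shape=2.0, scale=1.0, size=50)  # hypothetical grid density
risk = np.full(50, 0.03)                            # hypothetical constant risk

point = aggregate_traditional(density, risk)
small_area = aggregate_with_frame(density, risk, 200, 0.5, 500, rng)
large_area = aggregate_with_frame(density, risk, 200_000, 0.5, 500, rng)

print("traditional point aggregate:", point)
print("sd of draws, population 200:", small_area.std())
print("sd of draws, population 200000:", large_area.std())
```

Under these assumptions, the spread of the draws should be noticeably wider for the small population than for the large one, in line with the observation that the two approaches differ more as the population of an area decreases.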