When analyzing spatially referenced event data, the criteria for declaring rates "reliable" are still a matter of dispute. What these varying criteria have in common, however, is that crude estimates rarely satisfy them in small area analysis settings, prompting the use of spatial models to improve reliability. While this is a reasonable response, recent work has quantified the extent to which popular models from the spatial statistics literature can overwhelm the information contained in the data, leading to oversmoothing. Here, we begin by providing a definition of a "reliable" estimate for event rates that applies to both crude and model-based estimates and allows for discrete and continuous statements of reliability. We then construct a spatial Bayesian framework that allows users to infuse prior information into their models to improve reliability while also guarding against oversmoothing. We apply our approach to county-level birth data from Pennsylvania, highlighting the effect of oversmoothing in spatial models and showing how our approach lets users focus their attention on areas where sufficient data exist to drive inferential decisions. We conclude with a brief discussion of how this definition of reliability can be used in the design of small area studies.