Comparing outcomes across hospitals, often to identify underperforming hospitals, is a critical task in health services research. However, naive comparisons of average outcomes, such as surgery complication rates, can be misleading because hospital case mixes differ: a hospital's overall complication rate may be lower because its treatments are more effective or simply because it serves a healthier population. In this paper, we develop a method of "direct standardization" in which we re-weight each hospital's patient population to be representative of the overall population and then compare the weighted averages across hospitals. Adapting methods from survey sampling and causal inference, we find weights that directly control for imbalance between the hospital patient mix and the target population, even across many patient attributes. Critically, these balancing weights can also be tuned to preserve sample size for more precise estimates. We also derive principled measures of statistical precision, and use outcome modeling and Bayesian shrinkage to increase precision and account for variation in hospital size. We demonstrate these methods using claims data from Pennsylvania, Florida, and New York, estimating standardized hospital complication rates for general surgery patients. We conclude with a discussion of how to detect low-performing hospitals.
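As a rough illustration of the re-weighting idea, the sketch below standardizes one hospital's complication rate to a target case mix. It is not the balancing-weights estimator developed in the paper: it assumes a single categorical risk stratum and uses plain post-stratification weights (target share divided by hospital share). The function names, the strata labels, and the toy data are all hypothetical. The effective sample size uses Kish's formula, which gives a sense of how much precision the weighting costs.

```python
from collections import Counter


def standardization_weights(hospital_strata, target_shares):
    """Post-stratification weights (an illustrative stand-in, not the paper's method).

    Each patient gets weight = target share of their stratum / hospital share
    of that stratum, so the weighted hospital mix matches the target mix.
    """
    counts = Counter(hospital_strata)
    n = len(hospital_strata)
    # Exact matching is impossible if the hospital has no patients in a
    # stratum that the target population contains.
    missing = [s for s, share in target_shares.items()
               if share > 0 and counts[s] == 0]
    if missing:
        raise ValueError(f"no hospital patients in strata: {missing}")
    return [target_shares.get(s, 0.0) / (counts[s] / n) for s in hospital_strata]


def weighted_rate(outcomes, weights):
    """Weighted average of binary outcomes (1 = complication)."""
    return sum(w * y for w, y in zip(weights, outcomes)) / sum(weights)


def effective_sample_size(weights):
    """Kish's effective sample size: (sum w)^2 / sum(w^2)."""
    return sum(weights) ** 2 / sum(w * w for w in weights)


# Hypothetical hospital: 8 low-risk patients (1 complication)
# and 2 high-risk patients (1 complication).
strata = ["low"] * 8 + ["high"] * 2
outcomes = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]

# Hypothetical target population: half low-risk, half high-risk.
target = {"low": 0.5, "high": 0.5}

weights = standardization_weights(strata, target)
raw_rate = sum(outcomes) / len(outcomes)
std_rate = weighted_rate(outcomes, weights)
ess = effective_sample_size(weights)

print(f"raw rate:          {raw_rate:.4f}")
print(f"standardized rate: {std_rate:.4f}")
print(f"effective n:       {ess:.2f} of {len(outcomes)}")
```

In this toy example the raw rate of 0.20 rises to about 0.31 once the hospital's mostly low-risk patients are re-weighted to the target mix, while the effective sample size falls from 10 to about 6.4. That loss of precision is the cost that tuning the weights to preserve sample size is meant to limit.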