Healthcare data in the United States often records only a patient's coarse race group: for example, both Indian and Chinese patients are typically coded as ``Asian.'' It is unknown, however, whether this coarse coding conceals meaningful disparities in the performance of clinical risk scores across granular race groups. Here we show that it does. Using data from 418K emergency department visits, we assess clinical risk score performance disparities across granular race groups for three outcomes, five risk scores, and four performance metrics. Across outcomes and metrics, we show that there are significant granular disparities in performance within coarse race categories. In fact, variation in performance metrics within coarse groups often exceeds the variation between coarse groups. We explore why these disparities arise, finding that outcome rates, feature distributions, and the relationships between features and outcomes all vary significantly across granular race categories. Our results suggest that healthcare providers, hospital systems, and machine learning researchers should strive to collect, release, and use granular race data in place of coarse race data, and that existing analyses may significantly underestimate racial disparities in performance.