Machine learning (ML) in medicine leverages the wealth of healthcare data to extract knowledge, facilitate clinical decision-making, and ultimately improve care delivery. However, ML models trained on datasets that lack demographic diversity can perform suboptimally when applied to underrepresented populations (e.g., ethnic minorities or patients of lower socioeconomic status), thus perpetuating health disparities. In this study, we evaluated four classifiers built to predict hyperchloremia, a condition that often results from aggressive fluid administration in the ICU population, and compared their performance across racial, gender, and insurance subgroups. We observed that adding social determinants features to the lab-based ones improved model performance on all patients. Subgroup testing yielded significantly different AUC scores in 40 of the 44 model-subgroup pairs, suggesting disparities when ML models are applied to social determinants subgroups. We urge future researchers to design models that proactively adjust for potential biases and to include subgroup reporting in their studies.
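The subgroup reporting urged above can be illustrated with a minimal sketch, which is not the study's actual pipeline. It computes AUC separately for each subgroup using the pairwise (Mann-Whitney) formulation. The function names `auc` and `subgroup_auc`, the group labels "A" and "B", and the toy data are all hypothetical, and the sketch does not include the significance testing described in the study.

```python
from collections import defaultdict


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Returns None when a class is missing,
    because AUC is undefined in that case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def subgroup_auc(labels, scores, groups):
    """Compute AUC separately for each subgroup label in `groups`."""
    by_group = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        by_group[g][0].append(y)
        by_group[g][1].append(s)
    return {g: auc(ys, ss) for g, (ys, ss) in by_group.items()}


# Made-up toy data for illustration only (not from the study).
labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.8, 0.4, 0.6, 0.7, 0.3, 0.5]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

overall = auc(labels, scores)
per_group = subgroup_auc(labels, scores, groups)
print("overall:", overall)
print("per subgroup:", per_group)
```

In this toy example the pooled AUC hides a large gap between the two subgroups, which is the kind of discrepancy that subgroup reporting is meant to surface.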