As machine learning (ML) continues to be integrated into healthcare systems that affect clinical decision making, new strategies will be needed to detect and evaluate subgroup disparities effectively and to ensure accountability and generalizability in clinical workflows. In this paper, we explore how epistemic uncertainty can be used to evaluate disparity across patient demographic (race) and data acquisition (scanner) subgroups for breast density assessment on a dataset of 108,190 mammograms collected from 33 clinical sites. Our results show that even when aggregate performance is comparable, the choice of uncertainty quantification metric can significantly affect results at the subgroup level. We hope this analysis promotes further work on how uncertainty can be leveraged to increase the transparency of machine learning applications for clinical deployment.
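To make the idea of comparing epistemic uncertainty across subgroups concrete, the following is a minimal sketch rather than the method used in this work. It assumes an ensemble (or repeated stochastic forward passes) that yields several softmax predictions per mammogram, and it uses mutual information between predictions and model members as the epistemic uncertainty metric. The abstract does not name its metrics, so this choice, the function names `epistemic_uncertainty` and `uncertainty_by_subgroup`, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def epistemic_uncertainty(probs, eps=1e-12):
    """Mutual information per sample (an assumed epistemic uncertainty metric).

    probs: array of shape (members, samples, classes) holding softmax outputs
    from each ensemble member (hypothetical setup, not taken from the paper).
    """
    probs = np.asarray(probs, dtype=float)
    mean_probs = probs.mean(axis=0)
    # Entropy of the averaged prediction (total uncertainty).
    total = -(mean_probs * np.log(mean_probs + eps)).sum(axis=-1)
    # Average entropy of the individual members (aleatoric part).
    expected = -(probs * np.log(probs + eps)).sum(axis=-1).mean(axis=0)
    # The difference is the disagreement between members.
    return total - expected


def uncertainty_by_subgroup(uncertainty, groups):
    """Mean uncertainty for each subgroup label (e.g. race or scanner type)."""
    groups = np.asarray(groups)
    return {str(g): float(uncertainty[groups == g].mean()) for g in np.unique(groups)}


# Toy data: 2 ensemble members, 4 samples, 2 classes (illustrative values only).
toy_probs = [
    [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1], [0.1, 0.9]],
    [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.9, 0.1]],
]
toy_groups = ["A", "A", "B", "B"]  # hypothetical subgroup labels

by_group = uncertainty_by_subgroup(epistemic_uncertainty(toy_probs), toy_groups)
print(by_group)
```

In this toy case the ensemble members agree on every sample in subgroup A and disagree on every sample in subgroup B, so B receives the higher mean epistemic uncertainty. Swapping in a different metric (for example, predictive entropy alone) can change how subgroups compare, which is the kind of sensitivity the abstract describes.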