Machine learning (ML) has become a critical tool in public health, offering the potential to improve population health, diagnosis, treatment selection, and health system efficiency. However, biases in data and model design can create disparities for protected groups and amplify existing inequalities in healthcare. To address this challenge, this study summarizes seminal literature on ML fairness and presents a framework for identifying and mitigating biases in data and models. The framework provides guidance on incorporating fairness into each stage of the typical ML pipeline: data processing, model design, deployment, and evaluation. To illustrate the impact of data biases on ML models, we present case studies demonstrating how systematic biases can be amplified through model predictions. These case studies show how the framework can be used to prevent such biases and highlight the need for fair and equitable ML models in public health. This work aims to inform and guide the use of ML in public health toward more ethical and equitable outcomes for all populations.
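As a minimal sketch of the evaluation stage described above, the snippet below computes two common group fairness metrics (demographic parity difference and equal opportunity difference) on hypothetical model outputs. The arrays `y_true`, `y_pred`, and `group`, and the binarized protected attribute they encode, are illustrative assumptions, not data or methods from the study.

```python
import numpy as np

def demographic_parity_difference(y_pred, group):
    """Gap in positive-prediction rates between the best- and worst-off groups."""
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return max(rates) - min(rates)

def equal_opportunity_difference(y_true, y_pred, group):
    """Gap in true-positive rates (recall) between groups."""
    tprs = []
    for g in np.unique(group):
        positives = (group == g) & (y_true == 1)  # actual positives within group g
        tprs.append(y_pred[positives].mean())
    return max(tprs) - min(tprs)

# Hypothetical labels, predictions, and protected-group membership for illustration.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 1])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])

print(f"Demographic parity difference: {demographic_parity_difference(y_pred, group):.3f}")
print(f"Equal opportunity difference:  {equal_opportunity_difference(y_true, y_pred, group):.3f}")
```

In practice, auditing metrics like these at the evaluation stage can surface disparities introduced earlier in the pipeline, before a model is deployed.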