This study presents a machine learning (ML) pipeline for clinical data classification, applied to the 30-day hospital readmission problem, together with a fairness audit of subgroups defined by sensitive attributes. Several ML models are trained for classification, and the fairness audit is conducted on their predictions. On the MIMIC III dataset, the audit uncovers disparities under the equal opportunity, predictive parity, false positive rate parity, and false negative rate parity criteria across subgroups defined by gender, ethnicity, language, and insurance group. These results reveal differences in model performance across groups and highlight the need for more effective fairness and bias mitigation strategies. The study calls for collaborative efforts among researchers, policymakers, and practitioners to address bias and fairness in artificial intelligence (AI) systems.
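For reference, the four fairness criteria named above have standard formal definitions; a minimal statement, using the common notation $Y$ for the true outcome, $\hat{Y}$ for the model prediction, and $A$ for a sensitive attribute (notation assumed here, not drawn from the study itself), is:

\begin{align*}
\text{Equal opportunity:} \quad & P(\hat{Y} = 1 \mid Y = 1, A = a) = P(\hat{Y} = 1 \mid Y = 1, A = b) \\
\text{Predictive parity:} \quad & P(Y = 1 \mid \hat{Y} = 1, A = a) = P(Y = 1 \mid \hat{Y} = 1, A = b) \\
\text{False positive rate parity:} \quad & P(\hat{Y} = 1 \mid Y = 0, A = a) = P(\hat{Y} = 1 \mid Y = 0, A = b) \\
\text{False negative rate parity:} \quad & P(\hat{Y} = 0 \mid Y = 1, A = a) = P(\hat{Y} = 0 \mid Y = 1, A = b)
\end{align*}

for every pair of subgroups $a, b$ (e.g., the gender, ethnicity, language, and insurance groups audited here). Since $\mathrm{FNR} = 1 - \mathrm{TPR}$ for a binary classifier, equal opportunity (equal true positive rates) and false negative rate parity are two views of the same condition, while predictive parity and false positive rate parity constrain complementary error directions.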