Machine learning models hold significant potential for predicting in-hospital mortality, yet data privacy constraints and the statistical heterogeneity of real-world clinical data often hamper their development. Federated Learning (FL) offers a privacy-preserving solution, but its performance under non-independent and identically distributed (non-IID) and class-imbalanced conditions requires rigorous investigation. This study presents a comparative benchmark of five FL strategies for mortality prediction: FedAvg, FedProx, FedAdagrad, FedAdam, and FedCluster. Using the large-scale MIMIC-IV dataset, we simulate a realistic non-IID environment by partitioning data by clinical care unit. To address the inherent class imbalance of the task, the SMOTE-Tomek technique is applied to each client's local training data. Our experiments, conducted over 50 communication rounds, show that the regularization-based strategy, FedProx, consistently outperformed the other methods, achieving the highest F1-score of 0.8831 while maintaining stable convergence. Although the baseline FedAvg was the most computationally efficient, its predictive performance was substantially lower. Our findings indicate that regularization-based FL algorithms such as FedProx offer a more robust and effective solution for heterogeneous and imbalanced clinical prediction tasks than standard or server-side adaptive aggregation methods. This work provides an empirical benchmark for selecting appropriate FL strategies for real-world healthcare applications.
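To make the contrast between the baseline and the regularization-based strategy concrete, the following is a minimal sketch in plain Python. It is not the study's implementation. The function names, the record layout `(unit, features, label)`, and the values of `lr` and `mu` are illustrative assumptions. It shows three things: grouping records by care unit to form non-IID clients, the sample-size-weighted averaging used by FedAvg, and a single FedProx local step, in which a proximal term `mu * (w - w_global)` pulls each client's weights back toward the global model.

```python
from collections import defaultdict


def partition_by_unit(records):
    """Group (unit, features, label) records into one client per care unit.

    Hypothetical record layout; each unit's data forms a non-IID client shard.
    """
    clients = defaultdict(list)
    for unit, features, label in records:
        clients[unit].append((features, label))
    return dict(clients)


def fedavg_aggregate(client_weights, client_sizes):
    """Average client weight vectors, weighted by local sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def fedprox_local_step(w_local, w_global, grad, lr=0.1, mu=0.01):
    """One local gradient step on loss(w) + (mu / 2) * ||w - w_global||^2.

    `grad` is the gradient of the task loss at w_local; lr and mu are
    assumed values, not those used in the study. With mu = 0 this
    reduces to the plain local SGD step used by FedAvg.
    """
    return [
        wl - lr * (g + mu * (wl - wg))
        for wl, wg, g in zip(w_local, w_global, grad)
    ]
```

In this sketch the server-side aggregation is the same for both strategies; the only difference is the extra proximal term in the local update, which limits how far a client trained on one care unit's distribution can drift from the shared model.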