Survival analysis is a subfield of statistics concerned with modeling the time until a particular event of interest occurs in a population. It has found widespread applications in healthcare, engineering, and the social sciences. However, real-world applications involve survival datasets that are distributed, incomplete, censored, and confidential. In this context, federated learning can substantially improve the performance of survival analysis applications. Federated learning provides a set of privacy-preserving techniques to jointly train machine learning models on multiple datasets without compromising user privacy, leading to better generalization performance. Despite the widespread development of federated learning in recent AI research, only a few studies focus on federated survival analysis. In this work, we present a novel federated algorithm for survival analysis based on one of the most successful survival models, the random survival forest. We call the proposed method Federated Survival Forest (FedSurF). With a single communication round, FedSurF obtains a discriminative power comparable to that of deep-learning-based federated models trained over hundreds of federated iterations. Moreover, FedSurF retains all the advantages of random forests, namely low computational cost and natural handling of missing values and incomplete datasets. These advantages are especially desirable in real-world federated environments with multiple small datasets stored on devices with low computational capabilities. Numerical experiments compare FedSurF with state-of-the-art survival models in federated networks, showing that FedSurF outperforms deep-learning-based federated algorithms in realistic environments with non-identically distributed data.
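The abstract does not spell out how the single communication round works, so the following is only a minimal sketch of one plausible one-round scheme, not the FedSurF algorithm itself. It assumes each client has already trained survival trees locally, that a tree can be treated as a function mapping a feature vector to a survival curve on a shared time grid, and that the server samples trees from clients in proportion to their dataset sizes. The names `pool_trees` and `predict_survival` and the size-proportional sampling rule are all hypothetical.

```python
import random
from typing import Callable, List, Sequence

# Assumption: a locally trained survival tree is modeled as a callable that
# maps a feature vector to survival probabilities on a shared time grid.
SurvivalTree = Callable[[Sequence[float]], List[float]]


def pool_trees(
    client_trees: Sequence[Sequence[SurvivalTree]],
    client_sizes: Sequence[int],
    n_trees: int,
    seed: int = 0,
) -> List[SurvivalTree]:
    """Build a global forest in one round from trees uploaded by clients.

    Hypothetical rule: pick a client with probability proportional to its
    dataset size, then pick one of that client's trees uniformly at random.
    """
    rng = random.Random(seed)
    pooled: List[SurvivalTree] = []
    for _ in range(n_trees):
        k = rng.choices(range(len(client_trees)), weights=client_sizes)[0]
        pooled.append(rng.choice(client_trees[k]))
    return pooled


def predict_survival(forest: Sequence[SurvivalTree], x: Sequence[float]) -> List[float]:
    """Average the survival curves predicted by every tree in the forest."""
    curves = [tree(x) for tree in forest]
    return [sum(values) / len(values) for values in zip(*curves)]
```

In this sketch only trained trees leave a client, never raw records, and the server needs no further rounds once the trees are pooled. A real system would use actual random survival forest trees in place of the placeholder callables.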