Survival analysis, or time-to-event analysis, aims to model and predict the time until an event of interest occurs in a population or for an individual. In the medical context, this event might be death, metastasis, recurrence of cancer, etc. Recently, neural networks designed specifically for survival analysis have become more popular and are an attractive alternative to more traditional methods. In this paper, we take advantage of the inherent properties of neural networks to federate the training of these models. This is crucial in the medical domain, since data is scarce and collaboration among multiple health centers is essential to reach conclusive decisions about the properties of a treatment or a disease. To ensure the privacy of the datasets, it is common to apply differential privacy on top of federated learning. Differential privacy works by introducing random noise at different stages of training, making it harder for an adversary to extract details about the data. However, in the realistic setting of small medical datasets and only a few data centers, this noise makes it harder for the models to converge. To address this problem, we propose DPFed-post, which adds a post-processing stage to the private federated learning scheme. This extra step helps to regulate the magnitude of the noisy average parameter update and eases convergence of the model. For our experiments, we choose 3 real-world datasets in the realistic setting where each health center has only a few hundred records, and we show that DPFed-post successfully increases the performance of the models by an average of up to $17\%$ compared to the standard differentially private federated learning scheme.
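The abstract does not spell out the post-processing step, so the following is only a minimal sketch of one differentially private federated averaging round, under stated assumptions. It assumes that client updates are clipped to an L2 norm, that Gaussian noise is added to their average, and that the post-processing consists of rescaling the noisy average to a maximum norm. The function names (`clip_by_norm`, `dp_fed_round`), the parameters (`clip_norm`, `noise_multiplier`, `post_norm`), and the noise scale are all hypothetical and are not calibrated to any particular privacy budget; the actual DPFed-post procedure may differ.

```python
import numpy as np


def clip_by_norm(vec, max_norm):
    """Rescale vec so that its L2 norm is at most max_norm."""
    norm = np.linalg.norm(vec)
    if norm > max_norm:
        return vec * (max_norm / norm)
    return vec


def dp_fed_round(client_updates, clip_norm, noise_multiplier, post_norm, rng):
    """One illustrative round: clip, average, add noise, then post-process.

    All parameter names and the noise scale are assumptions made for
    illustration; they are not taken from the paper.
    """
    # Bound each client's contribution before aggregation.
    clipped = [clip_by_norm(u, clip_norm) for u in client_updates]
    avg = np.mean(clipped, axis=0)

    # Gaussian noise on the average (illustrative scale, not calibrated).
    sigma = noise_multiplier * clip_norm / len(clipped)
    noisy_avg = avg + rng.normal(0.0, sigma, size=avg.shape)

    # Assumed post-processing: limit the magnitude of the noisy average.
    # This step uses only the already-noised output, never the raw data.
    return clip_by_norm(noisy_avg, post_norm)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three simulated centers, each sending a 10-dimensional update.
    updates = [rng.normal(size=10) for _ in range(3)]
    update = dp_fed_round(updates, clip_norm=1.0, noise_multiplier=5.0,
                          post_norm=0.5, rng=rng)
    print(np.linalg.norm(update))
```

Because the final step operates only on the noised aggregate, it is a form of post-processing in the differential privacy sense and does not weaken the privacy guarantee of the noisy average. With only a few centers, the noise term can dominate the averaged signal, which is the situation a bound on the update magnitude is meant to mitigate.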