Practical Challenges in Differentially-Private Federated Survival Analysis of Medical Data

Survival analysis, or time-to-event analysis, aims to model and predict the time until an event of interest occurs in a population or an individual. In the medical context, this event might be death, metastasis, recurrence of cancer, etc. Recently, neural networks designed specifically for survival analysis have become more popular and an attractive alternative to more traditional methods. In this paper, we take advantage of the inherent properties of neural networks to federate the training of these models. This is crucial in the medical domain, since data is scarce and collaboration among multiple health centers is essential to reach conclusive decisions about the properties of a treatment or a disease. To ensure the privacy of the datasets, it is common to apply differential privacy on top of federated learning. Differential privacy works by introducing random noise at different stages of training, making it harder for an adversary to extract details about the data. However, in the realistic setting of small medical datasets and only a few data centers, this noise makes it harder for the models to converge. To address this problem, we propose DPFed-post, which adds a post-processing stage to the private federated learning scheme. This extra step helps regulate the magnitude of the noisy average parameter update and eases convergence of the model. For our experiments, we choose 3 real-world datasets in the realistic setting where each health center has only a few hundred records, and we show that DPFed-post successfully increases the performance of the models by an average of up to $17\%$ compared to the standard differentially private federated learning scheme.
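
The abstract does not say what the post-processing stage of DPFed-post actually does, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes a common differentially private federated averaging recipe (clip each client update, average, add Gaussian noise) and, as a purely illustrative post-processing step, re-clips the noisy average to the same norm bound. The function names, `clip_norm`, and `noise_multiplier` are all hypothetical. Because the step operates only on the already-noised output, it does not consume additional privacy budget.

```python
import math
import random


def l2_norm(vec):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def clip_to_norm(vec, max_norm):
    """Return a copy of vec, scaled down if its L2 norm exceeds max_norm."""
    norm = l2_norm(vec)
    if norm <= max_norm or norm == 0.0:
        return list(vec)
    scale = max_norm / norm
    return [x * scale for x in vec]


def dp_federated_round(client_updates, clip_norm, noise_multiplier, rng,
                       post_process=True):
    """One aggregation round of a generic DP federated averaging scheme.

    Assumed recipe (not taken from the paper):
      1. clip every client update to L2 norm <= clip_norm,
      2. average the clipped updates,
      3. add Gaussian noise scaled to the sensitivity of the average,
      4. optionally re-clip the noisy average (illustrative post-processing).
    """
    n = len(client_updates)
    dim = len(client_updates[0])
    clipped = [clip_to_norm(u, clip_norm) for u in client_updates]
    avg = [sum(u[i] for u in clipped) / n for i in range(dim)]
    # With only a few clients, sigma is large relative to the averaged signal.
    sigma = noise_multiplier * clip_norm / n
    noisy = [a + rng.gauss(0.0, sigma) for a in avg]
    if post_process:
        # The clean average has norm <= clip_norm, so any excess is noise.
        noisy = clip_to_norm(noisy, clip_norm)
    return noisy


if __name__ == "__main__":
    rng = random.Random(0)
    # Three hypothetical health centers, each sending a 2-parameter update.
    updates = [[3.0, 4.0], [0.3, -0.4], [-6.0, 8.0]]
    raw = dp_federated_round(updates, 1.0, 5.0, random.Random(0),
                             post_process=False)
    bounded = dp_federated_round(updates, 1.0, 5.0, rng)
    print("norm without post-processing:", l2_norm(raw))
    print("norm with post-processing:   ", l2_norm(bounded))
```

The design choice illustrated here is that bounding the noisy average keeps a single noisy round from moving the model arbitrarily far, which is one plausible way a post-processing step could ease convergence when there are few clients.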

