Federated learning (FL) is a decentralized method that enables hospitals to collaboratively train a model without sharing private patient data. In FL, participating hospitals periodically exchange training results, rather than training samples, with a central server. However, access to model parameters or gradients can still expose private training samples. To address this challenge, we adopt secure multiparty computation (SMC) to build a privacy-preserving federated learning framework. In the proposed method, the hospitals are divided into clusters. After local training, each hospital splits its model weights into shares and distributes them among the other hospitals in its cluster, so that no single hospital can recover another hospital's weights on its own. Each hospital then sums the shares it has received and sends the result to the central server. Finally, the central server aggregates these partial sums, recovering the average of the model weights and updating the global model without ever accessing any individual hospital's weights. We conduct experiments on a publicly available repository, The Cancer Genome Atlas (TCGA), and compare the proposed framework against differential privacy, with federated averaging as the baseline. The results show that, compared to differential privacy, our framework achieves higher accuracy with no risk of privacy leakage, at the cost of higher communication overhead.
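To make the share-split-and-sum step concrete, the sketch below implements additive secret sharing of weight vectors within one cluster. It is a minimal illustration, not the paper's implementation: it uses plain NumPy floats with random masks, whereas a production SMC protocol would operate on fixed-point values in a finite field, and all function and variable names here are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_shares(weights, n_parties, scale=100.0):
    """Additively split a weight vector into n_parties shares.

    The first n_parties - 1 shares are random masks; the last share
    is chosen so that all shares sum exactly to the original weights.
    Any subset of fewer than n_parties shares looks random on its own.
    """
    shares = [rng.uniform(-scale, scale, size=weights.shape)
              for _ in range(n_parties - 1)]
    shares.append(weights - np.sum(shares, axis=0))
    return shares

# Toy cluster of 3 hospitals, each holding a 4-dimensional "model".
n = 3
local_weights = [rng.normal(size=4) for _ in range(n)]

# Step 1: each hospital splits its weights and distributes the shares
# among the hospitals in its cluster (including itself).
all_shares = [make_shares(w, n) for w in local_weights]

# Step 2: hospital j sums the shares it received from every hospital;
# this partial sum reveals nothing about any single hospital's model.
partial_sums = [sum(all_shares[i][j] for i in range(n)) for j in range(n)]

# Step 3: the server adds the partial sums and averages, recovering
# only the mean of the weights, never an individual model.
global_avg = sum(partial_sums) / n

assert np.allclose(global_avg, np.mean(local_weights, axis=0))
```

The masks cancel when the partial sums are added, so the server obtains exactly the federated average while each individual share it never sees stays uniformly random.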