Federated learning (FL) is a distributed machine learning (ML) approach that allows models to be trained on decentralized data without centralizing it. This approach is particularly beneficial for medical applications because it addresses key challenges associated with medical data, such as privacy, security, and data ownership. Moreover, FL can improve the quality of ML models used in medical applications. Medical data is often diverse and can vary significantly across patient populations, making it challenging to develop ML models that are accurate and generalizable. FL allows medical data from multiple sources to be used, which can help improve the quality and generalizability of ML models. Differential privacy (DP) is the go-to algorithmic tool for making this process secure and private. In this work, we show that model performance can be further improved by employing local steps, a popular approach to improving the communication efficiency of FL, and by tuning the number of communication rounds. Concretely, given a fixed privacy budget, we derive the optimal number of local steps and communication rounds. We provide theoretical motivation, further corroborated by experimental evaluations on real-world medical imaging tasks.
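For intuition, the following is a minimal, illustrative sketch of differentially private federated averaging with local steps, the setting the abstract describes. It is not the paper's exact algorithm: the toy data, and the names K, R, LR, CLIP_NORM, and NOISE_STD are assumptions introduced here for illustration; in practice the noise scale would be set by a DP accountant for the given privacy budget.

```python
# Hedged sketch: DP federated averaging with K local SGD steps per round,
# run for R communication rounds. K and R are the quantities the paper
# tunes under a fixed privacy budget. All constants here are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

# Toy linear-regression data split across 4 "hospitals" (clients).
w_true = np.array([2.0, -1.0])
clients = []
for _ in range(4):
    X = rng.normal(size=(64, 2))
    y = X @ w_true + 0.1 * rng.normal(size=64)
    clients.append((X, y))

K = 5             # local steps per communication round (tunable)
R = 50            # communication rounds (tunable)
LR = 0.05         # local learning rate
CLIP_NORM = 1.0   # per-client update clipping bound for DP
NOISE_STD = 0.01  # Gaussian noise scale; set by a DP accountant in practice

w = np.zeros(2)   # global model
for _ in range(R):
    updates = []
    for X, y in clients:
        w_local = w.copy()
        for _ in range(K):  # K local gradient steps before communicating
            grad = 2 * X.T @ (X @ w_local - y) / len(y)
            w_local -= LR * grad
        delta = w_local - w
        # Clip each client's update so its sensitivity is bounded.
        delta *= min(1.0, CLIP_NORM / (np.linalg.norm(delta) + 1e-12))
        updates.append(delta)
    # Server: average the clipped updates and add Gaussian noise for DP.
    w += np.mean(updates, axis=0) + rng.normal(scale=NOISE_STD, size=w.shape)

print("recovered weights:", w)  # should approach w_true
```

The trade-off the abstract refers to is visible in this sketch: larger K reduces how often clients communicate, while each of the R noisy aggregations consumes privacy budget, so K and R must be balanced jointly.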