Federated learning (FL) enables building robust, generalizable AI models by leveraging diverse datasets from multiple collaborators without centralizing the data. We created NVIDIA FLARE (NVFlare) as an open-source software development kit (SDK) to make it easier for data scientists to use FL in research and real-world applications. The SDK includes state-of-the-art FL algorithms and federated machine learning approaches, which facilitate building workflows for distributed learning across enterprises. It also enables platform developers to create secure, privacy-preserving offerings for multiparty collaboration using homomorphic encryption or differential privacy. The SDK is a lightweight, flexible, and scalable Python package that lets researchers apply their data science workflows, written with any training library (PyTorch, TensorFlow, XGBoost, or even NumPy), in real-world FL settings. This paper introduces the key design principles of NVFlare and illustrates some use cases (e.g., COVID analysis) with customizable FL workflows that implement different privacy-preserving algorithms. Code is available at https://github.com/NVIDIA/NVFlare.
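To make the idea of training without centralizing data concrete, the following is a minimal sketch of federated averaging (FedAvg) in plain Python. It does not use the NVFlare API; the function names (`local_update`, `federated_average`, `run_fedavg`), the linear model, and the hyperparameters are all hypothetical choices made for illustration. Each client trains on its own private data and shares only model weights, which a server combines in proportion to each client's dataset size.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Dataset = Sequence[Tuple[float, float]]  # (x, y) pairs held privately by one client


def local_update(weights: Vector, data: Dataset,
                 lr: float = 0.05, epochs: int = 5) -> Vector:
    """Run a few gradient-descent steps for y ~ w0 + w1 * x on one client's data."""
    w0, w1 = weights
    n = len(data)
    for _ in range(epochs):
        # Gradient of half the mean squared error, computed on local data only.
        g0 = sum((w0 + w1 * x - y) for x, y in data) / n
        g1 = sum((w0 + w1 * x - y) * x for x, y in data) / n
        w0 -= lr * g0
        w1 -= lr * g1
    return [w0, w1]


def federated_average(updates: Sequence[Vector], sizes: Sequence[int]) -> Vector:
    """Average client weights, weighting each client by its number of samples."""
    total = sum(sizes)
    dim = len(updates[0])
    return [sum(u[i] * n for u, n in zip(updates, sizes)) / total
            for i in range(dim)]


def run_fedavg(client_data: Sequence[Dataset], rounds: int = 300) -> Vector:
    """Server loop: broadcast global weights, collect local updates, aggregate."""
    global_w: Vector = [0.0, 0.0]
    sizes = [len(d) for d in client_data]
    for _ in range(rounds):
        # Raw data never leaves a client; only updated weights are returned.
        updates = [local_update(global_w, d) for d in client_data]
        global_w = federated_average(updates, sizes)
    return global_w


if __name__ == "__main__":
    # Two hypothetical sites whose data both follow y = 2x + 1.
    site_a = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
    site_b = [(3.0, 7.0), (4.0, 9.0)]
    print(run_fedavg([site_a, site_b]))
```

In a real deployment the aggregation step is where privacy mechanisms such as homomorphic encryption or differential privacy would be applied; this sketch omits them for brevity.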