Federated Learning (FL) aims to train a machine learning (ML) model in a distributed fashion, strengthening data privacy while limiting data migration costs. It is a distributed learning framework naturally suited to privacy-sensitive medical imaging datasets. However, most current FL-based medical imaging works assume that silos have ground-truth labels for training. In practice, label acquisition in the medical field is challenging, as it often demands extensive labor and time. To address this challenge and leverage unannotated data silos to improve modeling, we propose an alternate training-based framework, Federated Alternate Training (FAT), which alternates training between annotated data silos and unannotated data silos. Annotated data silos exploit their annotations to learn a reasonable global segmentation model. Meanwhile, unannotated data silos use the global segmentation model as a target model to generate pseudo labels for self-supervised learning. We evaluate the proposed framework on two naturally partitioned federated datasets, KiTS19 and FeTS2021, and demonstrate its promising performance.
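To make the alternating round structure concrete, below is a minimal PyTorch sketch under stated assumptions: FedAvg-style parameter averaging, a toy one-layer per-pixel classifier standing in for a segmentation network, and synthetic silo data. The function names (fedavg, supervised_step, pseudo_label_step) and the strict even/odd phase schedule are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of an alternating FAT-style training loop (assumptions noted above).
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

def fedavg(states):
    """Average client model parameters (standard FedAvg aggregation)."""
    avg = copy.deepcopy(states[0])
    for k in avg:
        avg[k] = torch.stack([s[k].float() for s in states]).mean(dim=0)
    return avg

def supervised_step(model, x, y, lr=1e-3):
    """Annotated silo: one gradient step on ground-truth masks."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    opt.zero_grad()
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    opt.step()

def pseudo_label_step(model, target_model, x, lr=1e-3):
    """Unannotated silo: the frozen global (target) model produces pseudo
    labels, and the local model trains against them."""
    with torch.no_grad():
        pseudo = target_model(x).argmax(dim=1)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    opt.zero_grad()
    loss = F.cross_entropy(model(x), pseudo)
    loss.backward()
    opt.step()

# Toy 1x1-conv "segmentation" model and synthetic silo data, for illustration only.
def make_model():
    return nn.Conv2d(1, 2, kernel_size=1)  # 2-class per-pixel classifier

global_model = make_model()
annotated = [(torch.randn(4, 1, 16, 16), torch.randint(0, 2, (4, 16, 16)))
             for _ in range(2)]                              # (image, mask) silos
unannotated = [torch.randn(4, 1, 16, 16) for _ in range(2)]  # image-only silos

for rnd in range(4):
    states = []
    if rnd % 2 == 0:  # annotated phase: learn from ground truth
        for x, y in annotated:
            local = copy.deepcopy(global_model)
            supervised_step(local, x, y)
            states.append(local.state_dict())
    else:             # unannotated phase: self-train on pseudo labels
        for x in unannotated:
            local = copy.deepcopy(global_model)
            pseudo_label_step(local, global_model, x)
            states.append(local.state_dict())
    global_model.load_state_dict(fedavg(states))
```

Note the design choice in the unannotated phase: the global model is kept frozen while generating pseudo labels, reflecting the abstract's description of the global segmentation model as a fixed target for self-supervised training, while only the local copies are updated and aggregated.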