Federated learning has gained popularity among medical institutions because it enables collaborative training across clients (e.g., hospitals) without aggregating their data. However, owing to the high cost of creating annotations, especially for large 3D image datasets, clinical institutions often lack sufficient labeled data for local training, so the performance of the collaboratively trained model suffers under such limited supervision. Large institutions, on the other hand, have the resources to compile data repositories with high-resolution images and labels. Individual clients can therefore leverage the knowledge acquired from such public data repositories to mitigate the shortage of private annotated images. In this paper, we propose a federated few-shot learning method with dual knowledge distillation. The method enables joint training across clients with limited annotations without jeopardizing privacy. Its supervised component learns features from the limited labeled data at each client, while the unlabeled data is used to distill both feature-based and response-based knowledge from a national data repository, further improving the accuracy of the collaborative model and reducing the communication cost. Extensive evaluations are conducted on 3D magnetic resonance knee images from a private clinical dataset. The proposed method achieves superior performance and requires less training time than other semi-supervised federated learning methods. Code and additional visualization results are available at https://github.com/hexiaoxiao-cs/fedml-knee.
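To make the dual distillation idea concrete, the following is a minimal PyTorch-style sketch of a combined feature-based and response-based distillation loss that a client could apply to unlabeled images, using a teacher pretrained on the public repository. The function name, the MSE/KL choice, the temperature, and the weighting factor are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def dual_distillation_loss(student_feats, teacher_feats,
                           student_logits, teacher_logits,
                           temperature=2.0, alpha=0.5):
    """Illustrative dual knowledge distillation loss on unlabeled images.

    Combines a feature-based term (matching intermediate feature maps of the
    local student and the teacher pretrained on the public repository) with a
    response-based term (matching temperature-softened output distributions).
    Hyperparameters and weighting are assumptions for illustration only.
    """
    # Feature-based term: L2 distance between intermediate feature maps.
    feat_loss = F.mse_loss(student_feats, teacher_feats)

    # Response-based term: KL divergence between softened per-voxel
    # class probabilities (channel dimension holds the classes).
    t = temperature
    resp_loss = F.kl_div(
        F.log_softmax(student_logits / t, dim=1),
        F.softmax(teacher_logits / t, dim=1),
        reduction="batchmean",
    ) * (t * t)

    return alpha * feat_loss + (1.0 - alpha) * resp_loss
```

In such a setup, each client would add this unsupervised term to its supervised segmentation loss on the few labeled volumes before communicating updates to the server; the exact combination used in the paper may differ.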