Computer-aided diagnosis (CAD) can help pathologists improve the accuracy, consistency, and repeatability of cancer diagnosis. However, CAD models trained on histopathological images from a single center (hospital) generally generalize poorly because of staining inconsistencies among different centers. In this work, we propose a pseudo-data based self-supervised federated learning (FL) framework, named SSL-FL-BT, to improve both the diagnostic accuracy and generalization of CAD models. Specifically, pseudo histopathological images are generated at each center; they contain the inherent and specific properties of the real images in that center but do not include private information. These pseudo images are then shared with the central server for self-supervised learning (SSL). A multi-task SSL is designed to fully learn both the center-specific information and the common inherent representation according to the data characteristics. Moreover, a novel Barlow Twins based FL (FL-BT) algorithm is proposed to improve the local training of the CAD model in each center by conducting contrastive learning, which benefits the optimization of the global model in the FL procedure. Experimental results on three public histopathological image datasets indicate the effectiveness of the proposed SSL-FL-BT in terms of both diagnostic accuracy and generalization.
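For readers unfamiliar with the Barlow Twins objective that FL-BT builds on, the following is a minimal NumPy sketch of the standard Barlow Twins loss only. It is not the authors' implementation: the abstract does not say how this loss is combined with local training or with server-side aggregation. The function name `barlow_twins_loss`, the weight `lambda_offdiag`, and its default value are illustrative assumptions.

```python
import numpy as np


def barlow_twins_loss(z_a, z_b, lambda_offdiag=5e-3, eps=1e-9):
    """Standard Barlow Twins loss for two batches of embeddings.

    z_a, z_b: arrays of shape (batch, dim) holding embeddings of two
    augmented views of the same images. lambda_offdiag is an assumed,
    illustrative weight for the redundancy-reduction term.
    """
    n = z_a.shape[0]
    # Standardize every feature dimension across the batch.
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + eps)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + eps)
    # Cross-correlation matrix between the two views, shape (dim, dim).
    c = z_a.T @ z_b / n
    diag = np.diag(c)
    # Invariance term: pull diagonal entries toward 1.
    on_diag = np.sum((diag - 1.0) ** 2)
    # Redundancy-reduction term: push off-diagonal entries toward 0.
    off_diag = np.sum(c ** 2) - np.sum(diag ** 2)
    return float(on_diag + lambda_offdiag * off_diag)
```

Under these assumptions, two views that produce identical embeddings give a loss close to zero, while unrelated embeddings give a much larger value, which is the signal that would drive local training at each center.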