Computer-aided diagnosis (CAD) can help pathologists improve the accuracy, consistency, and repeatability of cancer diagnosis. However, CAD models trained on histopathological images from only a single center (hospital) generally generalize poorly because of staining inconsistencies among different centers. In this work, we propose a pseudo-data based self-supervised federated learning (FL) framework, named SSL-FL-BT, to improve both the diagnostic accuracy and the generalization of CAD models. Specifically, pseudo histopathological images are generated at each center; they contain the inherent and specific properties of the real images in that center but do not include private information. These pseudo images are then shared with the central server for self-supervised learning (SSL). A multi-task SSL scheme is then designed to learn both the center-specific information and the common inherent representation according to the data characteristics. Moreover, a novel Barlow Twins based FL (FL-BT) algorithm is proposed to improve the local training of the CAD model in each center by conducting contrastive learning, which benefits the optimization of the global model in the FL procedure. Experimental results on three public histopathological image datasets indicate the effectiveness of the proposed SSL-FL-BT in terms of both diagnostic accuracy and generalization.
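The abstract does not spell out the FL-BT objective, so the following is only a minimal sketch of the standard Barlow Twins loss that such a local training step could build on, not the authors' algorithm. The function name `barlow_twins_loss`, the weight `lambda_offdiag`, and the use of NumPy are assumptions made for illustration.

```python
import numpy as np


def barlow_twins_loss(z_a, z_b, lambda_offdiag=5e-3, eps=1e-12):
    """Standard Barlow Twins objective (illustrative, not the paper's FL-BT).

    z_a, z_b: arrays of shape (batch, dim) holding embeddings of two
    augmented views of the same batch of images.
    lambda_offdiag: assumed weight of the redundancy-reduction term.
    """
    z_a = np.asarray(z_a, dtype=np.float64)
    z_b = np.asarray(z_b, dtype=np.float64)
    if z_a.ndim != 2 or z_a.shape != z_b.shape:
        raise ValueError("z_a and z_b must both have shape (batch, dim)")

    batch_size = z_a.shape[0]

    # Standardize every embedding dimension across the batch.
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + eps)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + eps)

    # Cross-correlation matrix between the two views, shape (dim, dim).
    c = z_a.T @ z_b / batch_size

    # Invariance term: pull the diagonal towards 1.
    diag = np.diag(c)
    on_diag = np.sum((diag - 1.0) ** 2)

    # Redundancy-reduction term: push off-diagonal entries towards 0.
    off_diag = np.sum(c ** 2) - np.sum(diag ** 2)

    return on_diag + lambda_offdiag * off_diag


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    view_a = rng.normal(size=(256, 8))
    # A lightly perturbed copy stands in for a second augmented view.
    view_b = view_a + 0.1 * rng.normal(size=(256, 8))
    print(barlow_twins_loss(view_a, view_b))
```

The loss is small when the two views yield nearly identical, decorrelated features and grows when the views disagree. In a federated setting, each center would presumably minimize a loss of this kind locally before its model is aggregated; how FL-BT combines it with the supervised objective is not stated in the abstract.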