Whole slide image (WSI) analysis has become an essential technique in computational pathology. Recent pathology foundation models (FMs) have shown significant advantages in deriving meaningful patch-level or slide-level feature representations from WSIs. However, current pathology FMs are substantially heterogeneous because they are trained on diverse private datasets and use different network architectures. This heterogeneity causes performance to vary when features extracted by different FMs are used in downstream tasks. To fully exploit the advantages of multiple FMs, we propose FuseCPath, a novel framework for fusing heterogeneous pathology FMs that yields a model with superior ensemble performance. The main contributions of our framework are as follows: (i) To ensure that the training patches are representative, we propose a multi-view clustering-based method that selects discriminative patches using the embeddings of multiple FMs. (ii) To effectively fuse heterogeneous patch-level FMs, we devise a cluster-level re-embedding strategy that captures patch-level local features online. (iii) To effectively fuse heterogeneous slide-level FMs, we devise a collaborative distillation strategy that explores the connections between slide-level FMs. Extensive experiments on lung cancer, bladder cancer, and colorectal cancer datasets from The Cancer Genome Atlas (TCGA) demonstrate that FuseCPath achieves state-of-the-art performance across multiple tasks on these public datasets.
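To make contribution (i) more concrete, the following is a minimal sketch of how patches might be selected by clustering embeddings from several FMs. It is not the FuseCPath implementation, whose details the abstract does not give. The sketch assumes a simple early-fusion form of multi-view clustering (L2-normalize each FM's embeddings, concatenate them, run k-means) and assumes that patches nearest to each cluster centroid are the ones to keep. The names `l2_normalize`, `kmeans`, and `select_patches`, and the parameters `k` and `per_cluster`, are hypothetical.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so that views with different scales are comparable."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, 1e-12)


def kmeans(x, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns labels and squared distances to every centroid."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1), dists


def select_patches(views, k, per_cluster):
    """Assumed selection rule: keep the `per_cluster` patches nearest each centroid.

    `views` is a list of (n_patches, dim_i) arrays, one per FM, with rows
    aligned so that row i in every array describes the same patch.
    """
    fused = np.concatenate([l2_normalize(v) for v in views], axis=1)
    labels, dists = kmeans(fused, k)
    selected = []
    for j in range(k):
        idx = np.flatnonzero(labels == j)
        if idx.size == 0:
            continue
        nearest = idx[np.argsort(dists[idx, j])]
        selected.extend(nearest[:per_cluster].tolist())
    return sorted(selected)


# Toy usage: 60 patches embedded by two hypothetical FMs with different dimensions.
rng = np.random.default_rng(42)
view_a = rng.normal(size=(60, 16))
view_b = rng.normal(size=(60, 8))
kept = select_patches([view_a, view_b], k=3, per_cluster=4)
```

Concatenation after normalization is only one of several ways to combine views; a consensus over per-view clusterings would be another reasonable choice under the same description.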