Advancements in digital pathology and artificial intelligence have enabled deep learning-based computer vision techniques for automated disease diagnosis and prognosis. However, whole slide images (WSIs) present unique computational and algorithmic challenges. WSIs are gigapixel-sized, making it infeasible to use them directly to train deep neural networks. Hence, a two-stage modeling approach is adopted: patch representations are extracted first, then aggregated for WSI-level prediction. Such approaches require detailed pixel-level annotations to train the patch encoder; however, obtaining these annotations is time-consuming and tedious for medical experts. Transfer learning is used to address this gap, with deep learning architectures pre-trained on ImageNet employed to generate patch-level representations. Even though ImageNet differs significantly from histopathology data, networks pre-trained on it have been shown to perform impressively on histopathology tasks. Moreover, progress in self-supervised and multi-task learning, coupled with the release of multiple histopathology datasets, has led to histopathology-specific pre-trained networks. In this work, we compare the performance of features extracted from networks trained on ImageNet and on histopathology data. We use an attention pooling network over these extracted features for slide-level aggregation, and investigate whether features learned by more complex networks lead to gains in performance. We use a simple top-k sampling approach as a fine-tuning framework and study the representation similarity between frozen and fine-tuned networks using Centered Kernel Alignment (CKA). Further, to examine whether intermediate block representations are better suited for feature extraction and whether ImageNet architectures are unnecessarily large for histopathology, we truncate blocks of ResNet18 and DenseNet121 and evaluate the resulting performance.
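The slide-level aggregation step described above can be illustrated with a minimal attention-pooling sketch: given pre-extracted patch features for one WSI, an attention head scores each patch and the slide representation is the attention-weighted sum. This is a simplified illustration, not the paper's exact network; the parameter shapes, the `tanh` scoring head, and the random inputs below are all assumptions for demonstration.

```python
import numpy as np

def attention_pool(patch_feats, W, v):
    """Aggregate N patch feature vectors (N, D) into a single slide-level
    vector (D,) via a simple attention head.
    W: (D, H) projection and v: (H,) scoring vector stand in for learned
    parameters; here they are plain random arrays (hypothetical)."""
    scores = np.tanh(patch_feats @ W) @ v      # (N,) unnormalised patch scores
    weights = np.exp(scores - scores.max())    # stable softmax over patches
    weights /= weights.sum()
    slide_vec = weights @ patch_feats          # attention-weighted sum
    return slide_vec, weights

rng = np.random.default_rng(0)
feats = rng.normal(size=(50, 128))             # 50 patches, 128-d features
W = 0.1 * rng.normal(size=(128, 64))
v = 0.1 * rng.normal(size=64)
slide_vec, attn = attention_pool(feats, W, v)
```

The attention weights produced here are also what a top-k sampling scheme could rank patches by, keeping only the k highest-scoring patches for fine-tuning the encoder.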
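The representation-similarity analysis mentioned above uses Centered Kernel Alignment. A minimal sketch of linear CKA between two feature matrices computed on the same inputs (e.g., features from a frozen vs. a fine-tuned network) is shown below; the random test matrices are illustrative assumptions.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between representation matrices X (n, d1) and Y (n, d2)
    computed on the same n examples. Returns a similarity in [0, 1]."""
    X = X - X.mean(axis=0)                     # center features
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, "fro")
    norm_y = np.linalg.norm(Y.T @ Y, "fro")
    return hsic / (norm_x * norm_y)

rng = np.random.default_rng(1)
A = rng.normal(size=(32, 16))                  # hypothetical frozen features
B = A @ rng.normal(size=(16, 8))               # hypothetical fine-tuned features
sim_self = linear_cka(A, A)                    # identical reps -> 1
sim_ab = linear_cka(A, B)
```

Linear CKA is invariant to orthogonal transformations and isotropic scaling of either representation, which makes it a convenient way to compare layers of networks with different widths.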