Current approaches for classification of whole slide images (WSI) in digital pathology predominantly utilize a two-stage learning pipeline. The first stage identifies areas of interest (e.g., tumor tissue), while the second stage processes cropped tiles from these areas in a supervised fashion. During inference, a large number of tiles are combined into a unified prediction for the entire slide. A major drawback of such approaches is the requirement for task-specific auxiliary labels, which are not acquired in clinical routine. We propose a novel learning pipeline for WSI classification that is trainable end-to-end and does not require any auxiliary annotations. We apply our approach to predict molecular alterations for a number of different use cases, including detection of microsatellite instability in colorectal tumors and prediction of specific mutations for colon, lung, and breast cancer cases from The Cancer Genome Atlas. Results reach AUC scores of up to 94% and are shown to be competitive with state-of-the-art two-stage pipelines. We believe our approach can facilitate future research in digital pathology and contribute to solving a large range of problems around the prediction of cancer phenotypes, hopefully enabling personalized therapies for more patients in the future.
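As an illustration of the tile-aggregation step in the two-stage baseline described above, and of how a slide-level AUC is obtained, the following is a minimal sketch. It assumes mean pooling of per-tile probabilities, which is only one common choice and is not stated in the abstract; the helper names `slide_score` and `auc` and the toy data are hypothetical.

```python
import numpy as np


def slide_score(tile_probs):
    """Mean-pool per-tile probabilities into one slide-level score.

    Mean pooling is an assumption made for illustration; the abstract
    does not say how tiles are combined.
    """
    tile_probs = np.asarray(tile_probs, dtype=float)
    if tile_probs.size == 0:
        raise ValueError("a slide needs at least one tile")
    return float(tile_probs.mean())


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied pairs count as half a correct ranking.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("both classes must be present")
    # Compare every positive score against every negative score.
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)


# Toy data (hypothetical): per-tile probabilities and a binary slide label.
slides = {
    "slide_a": ([0.9, 0.8, 0.7], 1),
    "slide_b": ([0.2, 0.1, 0.3], 0),
    "slide_c": ([0.6, 0.4, 0.5], 1),
    "slide_d": ([0.5, 0.7, 0.6], 0),
}
scores = [slide_score(probs) for probs, _ in slides.values()]
labels = [label for _, label in slides.values()]
print(f"slide-level AUC: {auc(labels, scores):.2f}")
```

In this toy example three of the four positive/negative slide pairs are ranked correctly, so the AUC is 0.75. An end-to-end pipeline such as the one proposed would replace the separate tile classifier and pooling step with a single trainable model.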