Deep learning (DL) can predict biomarkers from cancer histopathology, and several clinically approved applications use this technology. Most approaches, however, predict categorical labels, whereas biomarkers are often continuous measurements. We hypothesized that regression-based DL outperforms classification-based DL. We therefore developed and evaluated a new self-supervised, attention-based, weakly supervised regression method that predicts continuous biomarkers directly from images, using data from 11,671 patients across nine cancer types. We tested our method on multiple clinically and biologically relevant biomarkers: the homologous recombination deficiency (HRD) score, a clinically used pan-cancer biomarker, as well as markers of key biological processes in the tumor microenvironment. Compared with classification, regression significantly improves the accuracy of biomarker prediction and also improves the interpretability of the results. In a large cohort of colorectal cancer patients, regression-based prediction scores provide higher prognostic value than classification-based scores. Our open-source regression approach offers a promising alternative for continuous biomarker analysis in computational pathology.
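The abstract does not describe the architecture in detail, so the following is only a minimal sketch of the general idea behind attention-based weakly supervised regression, not the authors' implementation. It assumes each slide is a bag of precomputed tile feature vectors and that only a slide-level continuous label is available. The function names (`attention_mil_regress`, `mse`), the linear attention scorer, and the linear regression head are all hypothetical simplifications chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_mil_regress(tiles, attn_w, head_w, head_b):
    """Predict one continuous value for a bag of tile feature vectors.

    tiles:  list of feature vectors, one per image tile of a slide
    attn_w: weights of a linear attention scorer (assumed, simplified)
    head_w, head_b: weights and bias of a linear regression head (assumed)
    """
    # One scalar score per tile, normalized so the weights sum to 1.
    weights = softmax([dot(t, attn_w) for t in tiles])
    # Slide-level embedding: attention-weighted average of tile features.
    dim = len(tiles[0])
    slide = [sum(w * t[i] for w, t in zip(weights, tiles)) for i in range(dim)]
    # Regression head outputs a continuous score instead of class logits.
    prediction = dot(slide, head_w) + head_b
    return prediction, weights


def mse(pred, target):
    """Squared error against the continuous slide-level label."""
    return (pred - target) ** 2


# Toy bag with three tiles and two features per tile (made-up numbers).
tiles = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pred, weights = attention_mil_regress(tiles, [0.0, 0.0], [1.5, 1.5], 0.0)
loss = mse(pred, 2.5)
```

In this sketch the per-tile attention weights double as a rough indication of which tiles drive the slide-level prediction, and the loss compares the output with the continuous label directly rather than with a thresholded class.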