Histopathological images of tumors contain abundant information about how tumors grow and how they interact with their micro-environment. Characterizing tissue phenotypes and improving our understanding of them could reveal factors related to tumor progression and their underpinning biological processes, ultimately improving diagnosis and treatment. In recent years, deep learning applications in histology have seen great progress, yet most of them take a supervised approach, relating tissue images to their associated sample annotations. The impact of supervised approaches is limited by two factors. First, high-quality labels are expensive in time and effort, so they do not scale easily. Second, these methods focus on predicting annotations from histological images, which fundamentally restricts the discovery of new tissue phenotypes. These limitations emphasize the importance of new methods that can characterize tissue by the features contained in the image, without pre-defined annotation or supervision. We present Phenotype Representation Learning (PRL), a methodology to extract histomorphological phenotypes through self-supervised learning and community detection. PRL creates phenotype clusters by identifying tissue patterns that share common morphological and cellular features, allowing whole slide images to be described through compositional representations of cluster contributions. We used this framework to analyze histopathology slides of the LUAD and LUSC lung cancer subtypes from the TCGA and NYU cohorts. We show that PRL achieves robust lung subtype prediction and provides statistically relevant phenotypes for each subtype. We further demonstrate the significance of these phenotypes in lung adenocarcinoma overall and recurrence-free survival, relating clusters to patient outcomes, cell types, growth patterns, and omic-based immune signatures.
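To make the idea of a compositional slide representation concrete, the following is a minimal sketch rather than the authors' implementation. It assumes each tile of a whole slide image has already been assigned to a phenotype cluster (the self-supervised embedding and community detection steps are not shown), and it describes each slide by the fraction of its tiles that fall in each cluster. The function name `slide_composition` and the toy slide and cluster labels are hypothetical.

```python
import numpy as np


def slide_composition(slide_ids, cluster_ids, n_clusters):
    """Describe each slide by the fraction of its tiles in each cluster.

    slide_ids:   one slide identifier per tile (hypothetical labels).
    cluster_ids: one integer cluster label per tile, in [0, n_clusters).
    Returns the sorted unique slide identifiers and a matrix of shape
    (n_slides, n_clusters) whose rows each sum to 1.
    """
    slide_ids = np.asarray(slide_ids)
    cluster_ids = np.asarray(cluster_ids)

    # Map every tile to the row index of its slide.
    slides, slide_idx = np.unique(slide_ids, return_inverse=True)

    # Count tiles per (slide, cluster) pair; np.add.at handles repeats.
    counts = np.zeros((len(slides), n_clusters), dtype=float)
    np.add.at(counts, (slide_idx, cluster_ids), 1.0)

    # Normalize counts to fractions so slides of different size are comparable.
    return slides, counts / counts.sum(axis=1, keepdims=True)


# Toy example: two slides, three clusters (made-up assignments).
slides, comp = slide_composition(
    ["s1", "s1", "s1", "s1", "s2", "s2"],
    [0, 0, 1, 2, 2, 2],
    n_clusters=3,
)
print(slides)
print(comp)
```

A vector of this kind could then serve as input to a simple downstream model, for example one that predicts subtype or survival; the choice of such a model is outside the scope of this sketch.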