Tumour heterogeneity in breast cancer makes it difficult to predict outcome and response to therapy. Spatial transcriptomics technologies may address this challenge, as they provide a wealth of information about gene expression at the cell level, but their cost hinders their use in large-scale clinical oncology studies. Predicting gene expression from hematoxylin and eosin (H&E)-stained histology images offers a more affordable alternative for such studies. Here we present BrST-Net, a deep learning framework that is trained on spatial transcriptomics data to predict gene expression from histopathology images. Using this framework, we trained and evaluated 10 state-of-the-art deep learning models, without pretrained weights, for the prediction of 250 genes. To improve the generalisation performance of the main network, we introduce an auxiliary network into the framework. Our methodology outperforms previous studies: 237 genes are predicted with positive correlation, including 24 genes with a median correlation coefficient greater than 0.50. By comparison, previous studies could predict only 102 genes with positive correlation, with the highest correlation values ranging from 0.29 to 0.34.
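The headline numbers above (genes with positive correlation, genes with a median correlation coefficient greater than 0.50) depend on a per-gene correlation between predicted and measured expression. The following is a minimal sketch of how such a summary could be computed, not the authors' evaluation code. It assumes Pearson correlation computed across spots and a median taken across held-out folds or patients; the abstract specifies neither. The function names `per_gene_pearson` and `summarize` are hypothetical.

```python
import numpy as np


def per_gene_pearson(pred, true):
    """Pearson r for each gene (column), computed across spots (rows).

    Assumption: Pearson correlation; the abstract only says "correlation".
    Genes with zero variance in either array get NaN.
    """
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    pred_c = pred - pred.mean(axis=0)
    true_c = true - true.mean(axis=0)
    denom = np.sqrt((pred_c ** 2).sum(axis=0) * (true_c ** 2).sum(axis=0))
    with np.errstate(divide="ignore", invalid="ignore"):
        r = (pred_c * true_c).sum(axis=0) / denom
    return np.where(denom > 0, r, np.nan)


def summarize(per_fold_r, threshold=0.5):
    """Median r per gene across folds, plus the two counts of interest.

    Assumption: the median is taken over held-out folds or patients.
    """
    median_r = np.nanmedian(np.stack(per_fold_r), axis=0)
    return {
        "median_r": median_r,
        "n_positive": int((median_r > 0).sum()),
        "n_above_threshold": int((median_r > threshold).sum()),
    }
```

With one `(spots, genes)` array of predictions and one of measurements per held-out fold, calling `per_gene_pearson` on each fold and passing the resulting list to `summarize` yields counts comparable in kind to the "237 genes" and "24 genes" figures, under the assumptions stated above.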