Spatial transcriptomics (ST) measures gene expression while preserving spatial information, offering critical insights into tissue architecture and disease pathology. Recent work has explored predicting transcriptome-wide gene expression profiles from hematoxylin and eosin (H&E)-stained whole-slide images (WSIs) using deep neural networks. The task is commonly framed as a regression problem in which each input is a localized image patch extracted from the WSI. Predicting spatial gene expression from histological images nevertheless remains challenging because of the significant modality gap between visual features and molecular signals. Several studies have attempted to incorporate both local and global information into predictive models, but existing methods still suffer from two key limitations: (1) insufficient granularity in local feature extraction, and (2) inadequate coverage of global spatial context. In this work, we propose MMAP (Multi-MAgnification and Prototype-enhanced architecture), a framework that addresses both limitations. To enhance local feature granularity, MMAP leverages multi-magnification patch representations that capture fine-grained histological details. To improve global contextual understanding, it learns a set of latent prototype embeddings that serve as compact representations of slide-level information. Extensive experiments demonstrate that MMAP consistently outperforms existing state-of-the-art methods across multiple evaluation metrics, including Mean Absolute Error (MAE), Mean Squared Error (MSE), and Pearson Correlation Coefficient (PCC).
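For readers unfamiliar with the three evaluation metrics named above, the following is a minimal sketch of how they might be computed for a patch-level regression task. It is an illustration only, not the evaluation code used for MMAP: the function name `regression_metrics`, the `(n_spots, n_genes)` array layout, and the choice to average PCC over genes are all assumptions made here, since the abstract does not specify how the metrics are aggregated.

```python
import numpy as np


def regression_metrics(y_true, y_pred, eps=1e-12):
    """Compute MAE, MSE, and mean per-gene PCC.

    Assumed layout (hypothetical, not taken from the paper):
    rows are spots/patches, columns are genes.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape or y_true.ndim != 2:
        raise ValueError("expected two arrays of shape (n_spots, n_genes)")

    err = y_pred - y_true
    mae = np.abs(err).mean()   # mean absolute error over all entries
    mse = (err ** 2).mean()    # mean squared error over all entries

    # Pearson correlation per gene: center each column, then normalize
    # the covariance by the product of the column norms.
    t = y_true - y_true.mean(axis=0)
    p = y_pred - y_pred.mean(axis=0)
    denom = np.sqrt((t ** 2).sum(axis=0) * (p ** 2).sum(axis=0))
    # A gene with zero variance has numerator 0, so its PCC is reported as 0.
    pcc_per_gene = (t * p).sum(axis=0) / np.maximum(denom, eps)

    return {
        "mae": float(mae),
        "mse": float(mse),
        "pcc": float(pcc_per_gene.mean()),  # assumption: average over genes
    }
```

Lower MAE and MSE indicate smaller prediction error, while a PCC closer to 1 indicates that predicted expression tracks the measured expression across spots. Averaging PCC over genes is one common convention; other aggregations are possible.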