Spatial transcriptomics is a modern sequencing technology that measures the activity of thousands of genes in a tissue sample and maps where that activity occurs. This technology has enabled the study of so-called spatially expressed genes, i.e., genes whose expression varies spatially across the tissue. Understanding their functions and interactions in different areas of the tissue is of great scientific interest, as it might lead to a deeper understanding of several key biological mechanisms. However, adequate statistical tools that exploit the newly available spatial mapping information to reach more specific conclusions are still lacking. In this work, we introduce SpaRTaCo, a new statistical model that clusters the spatial expression profiles of the genes according to the areas of the tissue. This is accomplished by co-clustering, i.e., inferring the latent block structure of the data and inducing two types of clustering: of the genes, using their expression across the tissue, and of the image areas, using the gene expression in the spots where the RNA is collected. The proposed methodology is validated with a series of simulation experiments, and its usefulness in answering specific biological questions is illustrated with an application to a human brain tissue sample processed with the 10X Visium protocol.
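To make the notion of a latent block structure concrete, the following is a minimal sketch and not the SpaRTaCo model itself. It assumes the gene and spot cluster labels are already known, whereas the model described above infers them from the data, and it summarizes each block only by its mean expression. The function name `block_means`, the toy matrix, and the label vectors are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def block_means(matrix, row_labels, col_labels):
    """Average the entries of `matrix` within each (row cluster, column cluster) block."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            key = (row_labels[i], col_labels[j])
            sums[key] += value
            counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Toy expression matrix: 4 genes (rows) x 4 spots (columns). Values are made up.
expression = [
    [5.0, 5.0, 1.0, 1.0],
    [5.0, 5.0, 1.0, 1.0],
    [2.0, 2.0, 8.0, 8.0],
    [2.0, 2.0, 8.0, 8.0],
]
gene_clusters = [0, 0, 1, 1]  # assumed known here; a co-clustering model would infer these
spot_clusters = [0, 0, 1, 1]  # assumed known here; a co-clustering model would infer these

means = block_means(expression, gene_clusters, spot_clusters)
for key in sorted(means):
    print(key, means[key])
```

Each key of `means` is a pair (gene cluster, spot cluster), so a partition of the rows together with a partition of the columns splits the matrix into rectangular blocks. In this toy case the entries within each block are identical, which is the idealized pattern a co-clustering procedure looks for.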