Single-cell RNA sequencing (scRNA-seq) and spatially resolved imaging/sequencing technologies have revolutionized biomedical research. On the one hand, scRNA-seq profiles a large portion of the transcriptome of individual cells but does not capture their spatial context. On the other hand, spatially resolved measurements come with a trade-off between resolution, throughput, and gene coverage. Combining data from these two modalities can provide a spatially resolved picture with enhanced resolution and gene coverage. Several methods have recently been developed to integrate these modalities, but they use only the expression of genes available in both modalities and do not incorporate other relevant and available features, especially the spatial context. We propose DOT, a novel optimization framework for assigning cell types to tissue locations. Our model (i) incorporates ideas from Optimal Transport theory to leverage not only joint but also distinct features, such as the spatial context, (ii) introduces scale-invariant distance functions to account for differences in the sensitivity of different measurement technologies, and (iii) provides control over the abundance of cells of different types in the tissue. We present a fast implementation based on the Frank-Wolfe algorithm, and we demonstrate the effectiveness of DOT in correctly assigning cell types and in estimating the expression of missing genes in spatial data from two areas of the brain, the developing heart, and breast cancer samples.
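To make two of the ingredients named above concrete, the following is a minimal sketch, not the DOT model itself. It assumes cosine distance as one example of a scale-invariant distance, and it applies a generic Frank-Wolfe iteration to a toy least-squares problem: estimating cell-type proportions for a single tissue location from hypothetical cell-type centroids. The function names, the objective, and the toy data are all illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def cosine_distance(a, b):
    """1 - cosine similarity; unchanged if a or b is scaled by a positive constant."""
    return 1.0 - float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))


def frank_wolfe_simplex(C, x, n_iter=500, tol=1e-10):
    """Minimise 0.5 * ||w @ C - x||^2 over the probability simplex.

    C: (K, G) hypothetical cell-type centroids, x: (G,) expression at one location.
    This toy objective is an assumption for illustration, not DOT's objective.
    """
    K = C.shape[0]
    w = np.full(K, 1.0 / K)              # start from uniform proportions
    for _ in range(n_iter):
        residual = w @ C - x
        grad = C @ residual              # gradient of the objective w.r.t. w
        s = np.zeros(K)
        s[np.argmin(grad)] = 1.0         # linear minimisation oracle: best simplex vertex
        d = s - w                        # Frank-Wolfe direction
        gap = -float(grad @ d)           # duality gap, non-negative by construction
        if gap <= tol:
            break
        Cd = d @ C
        gamma = min(1.0, gap / float(Cd @ Cd))  # exact line search for a quadratic
        w = w + gamma * d                # convex combination, so w stays on the simplex
    return w


if __name__ == "__main__":
    # Toy data: three cell types, three genes (made-up numbers).
    C = np.array([[5.0, 1.0, 0.0],
                  [0.0, 4.0, 1.0],
                  [1.0, 0.0, 6.0]])
    x = 0.7 * C[0] + 0.3 * C[2]
    print("estimated proportions:", frank_wolfe_simplex(C, x))
    print("cosine distance under rescaling:", cosine_distance(x, 10.0 * x))
```

Each Frank-Wolfe step only needs the gradient and an argmin over vertices, so iterates remain valid proportions without any projection step; this is one common reason the method is attractive for transport-style problems with simplex or marginal constraints.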