Development and homeostasis in multicellular systems both require exquisite control over the formation and maintenance of spatial molecular patterns. Advances in spatially resolved, high-throughput molecular imaging methods, such as multiplexed immunofluorescence and spatial transcriptomics (ST), provide exciting new opportunities to deepen our fundamental understanding of these processes in health and disease. The large and complex datasets resulting from these techniques, particularly ST, have led to the rapid development of innovative machine learning (ML) tools, primarily based on deep learning techniques. These ML tools are now increasingly featured in integrated experimental and computational workflows to disentangle signal from noise in complex biological systems. However, it can be difficult to understand and balance the different implicit assumptions and methodologies of the rapidly expanding ST analysis toolbox. To address this, we summarize the major ST analysis goals that ML can help achieve, along with current analysis trends. We also describe four major data science concepts and related heuristics that can help guide practitioners in choosing the right tools for the right biological questions.