Cell biologists study the morphology of cells in parallel with the regulatory mechanisms that modify this morphology. Such studies are complicated by the inherent heterogeneity of cell populations. It remains difficult to define the morphology of a cell with parameters that can quantify this heterogeneity, which leaves cell biologists to rely on manual inspection of cell images. We propose an alternative to manual inspection based on topological data analysis. We characterise the shape of a cell by its contour and nucleus. We build a filtration of the edges defining the contour using a radial distance function originating at the nucleus. This filtration is then used to construct a persistence diagram that serves as a signature of the cell shape. Two cells can then be compared by computing the Wasserstein distance between their persistence diagrams. Given a cell population, we compute a distance matrix containing all pairwise distances between its members. We analyse this distance matrix using hierarchical clustering with different linkage schemes and define a purity score that quantifies the consistency between those schemes, which can then be used to assess homogeneity within the cell population. We illustrate and validate our approach by identifying sub-populations in human mesenchymal stem cell populations.
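The pipeline from contour to pairwise distance can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not specify: the contour is a closed polygon, each edge enters the filtration at the larger radial distance of its two endpoints, only 0-dimensional persistence is computed, and the Wasserstein distance uses an L-infinity ground metric with p = 2. The function names (`radial_values`, `persistence_0d`, `wasserstein`) are hypothetical. The matching is found by brute force, which is exact but only feasible for very small diagrams.

```python
import math
from itertools import permutations


def radial_values(contour, centre):
    """Distance from the (assumed) nucleus centre to each contour vertex."""
    cx, cy = centre
    return [math.hypot(x - cx, y - cy) for x, y in contour]


def persistence_0d(values):
    """0-dimensional sublevel-set persistence on a closed polygon.

    Assumption: vertex i enters at values[i]; edge (i, i+1) enters at the
    larger of its endpoint values. Returns the finite (birth, death) pairs;
    the single component that never dies is omitted.
    """
    n = len(values)
    parent = list(range(n))
    birth = list(values)  # birth value of the component rooted at each vertex

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted(
        (max(values[i], values[(i + 1) % n]), i, (i + 1) % n) for i in range(n)
    )
    pairs = []
    for t, i, j in edges:
        ri, rj = find(i), find(j)
        if ri == rj:
            continue  # this edge closes the loop; not a 0-dimensional event
        if birth[ri] > birth[rj]:
            ri, rj = rj, ri  # elder rule: ri is the older component
        if t > birth[rj]:
            pairs.append((birth[rj], t))  # younger component dies at t
        parent[rj] = ri
    return pairs


def wasserstein(d1, d2, p=2):
    """p-Wasserstein distance between two small persistence diagrams.

    Points may be matched to each other or to the diagonal. Brute force over
    all matchings: for illustration on tiny diagrams only.
    """
    n, m = len(d1), len(d2)

    def cost(i, j):
        if i < n and j < m:  # point matched to point
            return max(abs(d1[i][0] - d2[j][0]), abs(d1[i][1] - d2[j][1])) ** p
        if i < n:  # point of d1 matched to the diagonal
            return ((d1[i][1] - d1[i][0]) / 2) ** p
        if j < m:  # point of d2 matched to the diagonal
            return ((d2[j][1] - d2[j][0]) / 2) ** p
        return 0.0  # diagonal matched to diagonal

    best = min(
        sum(cost(i, j) for i, j in enumerate(perm))
        for perm in permutations(range(n + m))
    )
    return best ** (1 / p)


# Toy contour (made-up coordinates) with the nucleus centre at the origin.
contour = [(1, 0), (0, 3), (-2, 0), (0, -4)]
diagram = persistence_0d(radial_values(contour, (0, 0)))  # [(2.0, 3.0)]
distance = wasserstein(diagram, [(2.0, 4.0)])             # 1.0
```

For realistic diagram sizes, the brute-force search would need to be replaced by an assignment solver such as `scipy.optimize.linear_sum_assignment`. Evaluating `wasserstein` for every pair of cells gives the distance matrix on which hierarchical clustering is then run.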