Topological data analysis (TDA) studies the shape patterns of data. Persistent homology (PH) is a widely used method in TDA that summarizes the homological features of data at multiple scales and stores them in persistence diagrams (PDs). In this paper, we propose a random persistence diagram generation (RPDG) method that generates a sequence of random PDs from those produced by the data. RPDG is underpinned by (i) a model based on pairwise interacting point processes for inference of persistence diagrams, and (ii) a reversible jump Markov chain Monte Carlo (RJ-MCMC) algorithm for generating samples of PDs. A first example, based on a synthetic dataset, demonstrates the efficacy of RPDG and provides a detailed comparison with existing methods for sampling PDs. A second example demonstrates the utility of RPDG in solving a materials science problem given a real dataset of small sample size.
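To make the two ingredients named in the abstract more concrete, the following is a minimal sketch of a birth-death Metropolis-Hastings sampler for a generic pairwise interacting point process (a Strauss-type density on the unit square). Birth-death moves change the number of points, so this is a simple trans-dimensional sampler in the spirit of RJ-MCMC; it is not the RPDG algorithm or the model from the paper. The function names, the unit-square window, and the parameters `beta` (intensity), `gamma` (interaction strength, between 0 and 1), and `r` (interaction radius) are all assumptions made for illustration.

```python
import math
import random


def count_close(point, points, r):
    """Count the points in `points` lying strictly within distance r of `point`."""
    return sum(1 for q in points if math.dist(point, q) < r)


def sample_pairwise_process(beta, gamma, r, n_steps, seed=0):
    """Illustrative birth-death Metropolis-Hastings sampler (hypothetical helper).

    Target: unnormalized density beta**n * gamma**(number of pairs closer than r)
    with respect to a unit-rate Poisson process on the unit square.
    Returns the point configuration after n_steps proposals.
    """
    rng = random.Random(seed)
    area = 1.0  # area of the sampling window (unit square, an assumption)
    points = []
    for _ in range(n_steps):
        n = len(points)
        if rng.random() < 0.5:
            # Birth proposal: add a uniformly placed point.
            new = (rng.random(), rng.random())
            ratio = beta * gamma ** count_close(new, points, r) * area / (n + 1)
            if rng.random() < min(1.0, ratio):
                points.append(new)
        elif n > 0:
            # Death proposal: remove a uniformly chosen existing point.
            idx = rng.randrange(n)
            rest = points[:idx] + points[idx + 1:]
            ratio = n / (area * beta * gamma ** count_close(points[idx], rest, r))
            if rng.random() < min(1.0, ratio):
                points = rest
    return points


if __name__ == "__main__":
    pts = sample_pairwise_process(beta=50.0, gamma=0.5, r=0.1, n_steps=5000)
    print(len(pts), "points in the final configuration")
```

Setting `gamma=1.0` removes the interaction and reduces the target to a Poisson process with intensity `beta`, while `gamma=0.0` forbids any two points closer than `r` (a hard-core process). A sampler aimed at persistence diagrams would additionally need to restrict points to the region above the diagonal and use a model fitted to the observed diagrams, neither of which is attempted here.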