Topological data analysis, including persistent homology, has undergone significant development in recent years. However, one outstanding challenge is to build a coherent statistical inference procedure on persistence diagrams. The paired, dependent structure of the data, namely the births and deaths in persistence diagrams, adds complexity to statistical inference. In this paper, we present a new lattice path representation for persistence diagrams and develop a new exact statistical inference procedure for lattice paths via combinatorial enumeration. We apply the proposed lattice path method to study the topological characterization of the protein structures of the COVID-19 virus. We demonstrate that topological changes occur during the conformational change of the spike proteins, a necessary step in infecting host cells.