LUMÁWIG: 地形数据分析中零瓶点尺寸远距离计算的有效算法 (LUMÁWIG: An Efficient Algorithm for Dimension Zero Bottleneck Distance Computation in Topological Data Analysis)

Stability of persistence diagrams under slight perturbations is a key characteristic behind the validity and growing popularity of topological data analysis in exploring real-world data. Central to this stability is the use of Bottleneck distance which entails matching points between diagrams. Use of this metric in practical studies has, however, been few and sparingly because of the computational obstruction, especially in dimension zero where the computational cost explodes with the growth of data size. We present LUM\'AWIG, a novel efficient algorithm to compute dimension zero bottleneck distance between two persistent diagrams which runs significantly faster and provides significantly sharper approximates with respect to the output of the original algorithm than any other available algorithm. We bypass the overwhelming matching problem in previous implementations of the bottleneck distance, and prove that the zero dimensional bottleneck distance can be recovered from a very small number of matching cases. We show that LUM\'AWIG generally enjoys linear complexity as shown by empirical tests. We also present an application that leverages dimension zero persistence diagrams and the bottleneck distance to produce features for classification tasks.

翻译：在轻微扰动下,持久性图的稳定性是一个关键特征,这是在探索真实世界数据时进行地形数据分析的有效性和越来越受欢迎的关键特征。这种稳定性的核心是使用波特勒内克距离,这需要图表之间的匹配点。然而,在实际研究中,由于计算障碍,使用这一指标的次数很少,而且不那么多,特别是在计算成本随着数据大小增长而爆炸的零度方面。我们提出了LUM\'AGIG,这是一种新颖的高效算法,用以计算两个持续图之间的维度零瓶颈距离,其运行速度要快得多,并且比任何其他可用的算法都对原始算法的输出提供了更精确的近似值。我们绕过了以前实施瓶颈距离的绝大多数匹配问题,并证明从极少数匹配案例中可以恢复零维特级瓶颈距离。我们表明,LUM\'AGIG通常具有线性复杂性,正如经验测试所显示的那样。我们还介绍了一种应用,即利用维度零持久性图和瓶内克距离来产生分类任务的特点。