Approximate Nearest Neighbors in the Space of Persistence Diagrams

Persistence diagrams are important tools in topological data analysis that describe the presence and magnitude of features in a filtered topological space. However, current approaches for comparing a persistence diagram to a set of other persistence diagrams either take time linear in the number of diagrams or offer no performance guarantees. In this paper, we apply concepts from locality-sensitive hashing to support approximate nearest neighbor search in the space of persistence diagrams. Given a set $\Gamma$ of $n$ $(M,m)$-bounded persistence diagrams, each with at most $m$ points, we snap-round the points of each diagram to points on a cubical lattice and produce a key for each possible snap-rounding. Specifically, we fix a grid over each diagram at several resolutions and consider the snap-roundings of each diagram in which every point moves to one of its four nearest lattice points. We then propose a data structure $\mathbb{D}_{\tau}$ with $\tau$ levels that stores all snap-roundings of each persistence diagram in $\Gamma$ at each resolution. This data structure has size $O(n5^m\tau)$, which accounts for the varying lattice resolutions, the snap-roundings, and the deletion of points with low persistence. To search for a persistence diagram, we compute a key for the query diagram by snapping each point to a lattice and deleting points of low persistence. Furthermore, as the lattice parameter decreases, searching our data structure yields a six-approximation of the nearest diagram in $\Gamma$ in $O((m\log{n}+m^2)\log\tau)$ time and a constant-factor approximation of the $k$th nearest diagram in $O((m\log{n}+m^2+k)\log\tau)$ time.
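To make the key construction concrete, the following Python sketch illustrates one way the snap-rounding idea could be realized. It is not the paper's implementation. The function names (`corner_choices`, `all_keys`, `query_key`, `build_index`), the threshold `min_persistence`, and the caller-supplied list of lattice widths `deltas` are all hypothetical. As a simplifying assumption, every stored point may be either snapped to one of four lattice corners or deleted, which gives at most $5^m$ keys per diagram per level, in line with the $O(n5^m\tau)$ size bound.

```python
from itertools import product
from math import floor


def corner_choices(point, delta):
    """Return the four lattice corners of the grid cell containing `point`."""
    b, d = point
    i, j = floor(b / delta), floor(d / delta)
    return [(i, j), (i + 1, j), (i, j + 1), (i + 1, j + 1)]


def all_keys(diagram, delta):
    """Enumerate one key per combination of per-point choices.

    Assumption: each point is either snapped to one of its four cell
    corners or deleted (None), so there are at most 5**m keys.
    """
    options = [corner_choices(p, delta) + [None] for p in diagram]
    keys = set()
    for combo in product(*options):
        # A key is the sorted multiset of surviving lattice points.
        keys.add(tuple(sorted(c for c in combo if c is not None)))
    return keys


def query_key(diagram, delta, min_persistence):
    """Snap each query point to its nearest lattice point, dropping
    points whose persistence (death - birth) is below the threshold."""
    kept = []
    for b, d in diagram:
        if d - b < min_persistence:
            continue
        kept.append((floor(b / delta + 0.5), floor(d / delta + 0.5)))
    return tuple(sorted(kept))


def build_index(diagrams, deltas):
    """Build one hash table per lattice width, mapping key -> diagram ids."""
    index = [dict() for _ in deltas]
    for diagram_id, diagram in enumerate(diagrams):
        for level, delta in enumerate(deltas):
            for key in all_keys(diagram, delta):
                index[level].setdefault(key, []).append(diagram_id)
    return index


# Hypothetical usage: two small diagrams indexed at two lattice widths.
diagrams = [[(0.2, 1.7), (0.4, 0.6)], [(3.1, 5.2)]]
index = build_index(diagrams, deltas=[1.0, 0.5])
candidates = index[0].get(query_key([(0.1, 1.9)], 1.0, 0.5), [])
```

The sketch looks up a single level only. The $\log\tau$ factor in the stated query times suggests a search over levels, which is not shown here, and the sketch makes no claim about the approximation guarantees.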
