Data Science aims to extract meaningful knowledge from unorganised data. Real datasets usually come in the form of a cloud of points given only by pairwise distances. Numerous applications require visualising the overall shape of a noisy cloud of points sampled from a non-linear object that is more complicated than a union of disjoint clusters. The skeletonisation problem in its hardest form is to find a 1-dimensional skeleton that correctly represents the shape of the cloud. This paper compares several algorithms that solve the above skeletonisation problem for any point cloud and guarantee a successful reconstruction. For example, given a highly noisy point sample of an unknown underlying graph, a reconstructed skeleton should be geometrically close and homotopy equivalent to the underlying graph, that is, it should have the same number of independent cycles. One of these algorithms produces a Homologically Persistent Skeleton (HoPeS) for any cloud without extra parameters. This universal skeleton contains subgraphs that provably represent the 1-dimensional shape of the cloud at any scale. Other subgraphs of HoPeS reconstruct an unknown graph from its noisy point sample with a correct homotopy type and within a small offset of the sample. Extensive experiments on synthetic and real data reveal for the first time the maximum level of noise that allows successful graph reconstructions.
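The criterion of having "the same number of independent cycles" can be made concrete with a small sketch. The Python code below is an illustration only and is not the HoPeS algorithm from the paper: it builds a neighbourhood graph on a point cloud at a chosen scale and counts independent cycles of that graph as E - V + C (edges minus vertices plus connected components). The names `cycle_rank` and `neighbourhood_graph`, the Euclidean distance, and the sample points are all assumptions made for this example.

```python
from itertools import combinations
from math import dist


def cycle_rank(num_vertices, edges):
    """Number of independent cycles of a simple graph: E - V + C.

    Vertices are 0..num_vertices-1; edges is a list of distinct pairs.
    Connected components are counted with a union-find structure.
    """
    parent = list(range(num_vertices))

    def find(v):
        # Follow parent links to the root, halving the path on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    components = num_vertices
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v  # merge two components
            components -= 1
    return len(edges) - num_vertices + components


def neighbourhood_graph(points, scale):
    """Hypothetical helper: join every pair of points at distance <= scale."""
    return [
        (i, j)
        for i, j in combinations(range(len(points)), 2)
        if dist(points[i], points[j]) <= scale
    ]


# Four corners of a unit square (illustrative data, not from the paper).
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(cycle_rank(len(square), neighbourhood_graph(square, 1.2)))  # prints 1
```

At scale 1.2 only the four sides of the square become edges, so the graph has one independent cycle. A smaller scale leaves the points disconnected, and a larger one adds the diagonals and hence extra graph cycles, which shows why the choice of scale matters for a reconstruction.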