Reeb spaces, as well as their discretized versions called Mappers, are common descriptors used in Topological Data Analysis, with many applications in various fields of science, such as computational biology and data visualization. The stability of the Mapper and its rate of convergence to the Reeb space have been studied extensively in recent works [BBMW19, CO17, CMO18, MW16], focusing on the case where a scalar-valued filter is used to compute the Mapper. On the other hand, much less is known in the multivariate case, where the codomain of the filter is $\mathbb{R}^p$, and in the general case, where it is a general metric space $(Z, d_Z)$ instead of $\mathbb{R}$. The few results available in this setting [DMW17, MW16] handle only continuous topological spaces and cannot be used as is for finite metric spaces representing data, such as point clouds and distance matrices. In this article, we introduce a slight modification of the usual Mapper construction and give risk bounds for estimating the Reeb space with this estimator. Our approach applies in particular to the setting where the filter function used to compute the Mapper is itself estimated from data, such as the eigenfunctions of PCA. Our results are stated with respect to the Gromov-Hausdorff distance, computed with the specific filter-based pseudometrics for Mappers and Reeb spaces defined in [DMW17]. Finally, we provide applications of this setting in statistics and machine learning for different kinds of target filters, as well as numerical experiments that demonstrate the relevance of our approach.