This paper presents a computational framework for the Principal Geodesic Analysis of merge trees (MT-PGA), a novel adaptation of the celebrated Principal Component Analysis (PCA) framework [87] to the Wasserstein metric space of merge trees [92]. We formulate MT-PGA computation as a constrained optimization problem, aiming at adjusting a basis of orthogonal geodesic axes, while minimizing a fitting energy. We introduce an efficient, iterative algorithm which exploits shared-memory parallelism, as well as an analytic expression of the fitting energy gradient, to ensure fast iterations. Our approach also trivially extends to extremum persistence diagrams. Extensive experiments on public ensembles demonstrate the efficiency of our approach - with MT-PGA computations in the orders of minutes for the largest examples. We show the utility of our contributions by extending to merge trees two typical PCA applications. First, we apply MT-PGA to data reduction and reliably compress merge trees by concisely representing them by their first coordinates in the MT-PGA basis. Second, we present a dimensionality reduction framework exploiting the first two directions of the MT-PGA basis to generate two-dimensional layouts of the ensemble. We augment these layouts with persistence correlation views, enabling global and local visual inspections of the feature variability in the ensemble. In both applications, quantitative experiments assess the relevance of our framework. Finally, we provide a C++ implementation that can be used to reproduce our results.
翻译:本文为合并树木的主要大地测量分析(MT-PGA)提供了一个计算框架,这是对著名的主要组成部分分析(PCA)框架(PAC)[87]的新修改,以适应瓦塞斯坦合并树木的瓦塞斯坦度空间[92]。我们将MT-PGA计算作为一个限制优化的问题,目的是调整正方形大地测量轴的基础,同时尽量减少适当的能量。我们引入了一种高效的迭代算法,利用共享的平行关系,以及精确地表达适合的能源梯度,以确保快速迭代。我们的方法还微不足道地延伸至极端的持久性图表。关于公共集合的广泛实验显示了我们的方法的效率,MT-PGA计算为最大例子在几分钟的顺序中进行计算。我们通过将两种典型的CPA应用加以合并,显示了我们的贡献的效用。首先,我们运用MT-PGA对数据进行简化和可靠地拼凑树木的组合,在MT-PGA的第一次坐标坐标坐标上代表它们。我们提出了一种维度削减框架,我们用这些直观的模型进行最终的升级框架。