We consider the 2-Wasserstein space of probability measures supported on the unit circle and propose a framework for Principal Component Analysis (PCA) of data living in this space. The framework builds on a detailed investigation of the optimal transport problem for measures on the unit circle, which may be of independent interest. In particular, we derive an (almost) closed-form expression for optimal transport maps and propose an alternative definition of the tangent space at an absolutely continuous probability measure, together with the associated exponential and logarithmic maps. PCA is performed by mapping the data to the tangent space at the Wasserstein barycentre. We approximate the barycentre via an iterative scheme and establish a sufficient a posteriori condition to assess the convergence of that scheme. We illustrate our methodology on several simulated scenarios and on a real data analysis of optic nerve thickness measurements.
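To give a concrete feel for optimal transport on the unit circle, the following minimal sketch computes the squared 2-Wasserstein distance between two empirical measures with the same number of equally weighted atoms. It is not the (almost) closed-form transport map derived in this work. It relies instead on the known fact that, for a convex cost of the geodesic distance, an optimal matching between two equal-size point sets on the circle is a cyclic shift of their sorted orders, and it simply tries every shift. The function names `geodesic_sq` and `circular_w2_squared`, and the parametrisation of the circle as $[0, 1)$ with circumference 1, are assumptions made for illustration only.

```python
def geodesic_sq(a, b):
    """Squared geodesic distance between two points on a circle of circumference 1."""
    d = abs(a - b) % 1.0
    d = min(d, 1.0 - d)  # go around the shorter way
    return d * d


def circular_w2_squared(x, y):
    """Squared 2-Wasserstein distance between two equal-size, equally weighted
    point clouds on the circle [0, 1), found by scanning all cyclic shifts.

    Returns (cost, shift), where sorted x[i] is matched to sorted y[(i + shift) % n].
    Illustrative O(n^2) sketch, not the method of the paper.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    xs = sorted(v % 1.0 for v in x)
    ys = sorted(v % 1.0 for v in y)
    n = len(xs)
    best_cost, best_shift = float("inf"), 0
    for k in range(n):
        # Average squared geodesic cost of the matching induced by shift k.
        cost = sum(geodesic_sq(xs[i], ys[(i + k) % n]) for i in range(n)) / n
        if cost < best_cost:
            best_cost, best_shift = cost, k
    return best_cost, best_shift


# Hypothetical usage: two small samples of angles, expressed as fractions of a turn.
cost, shift = circular_w2_squared([0.05, 0.10, 0.95], [0.00, 0.15, 0.90])
```

The signed displacements along the optimal matching, taken in $(-1/2, 1/2]$, play the role of a discrete logarithmic map from the first sample to the second. A PCA in the spirit of the framework above would collect such displacement vectors at an approximate barycentre, although the continuous construction is considerably more delicate than this discrete illustration.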