Phylogenetic PCA (p-PCA) is a variant of PCA for observations that are the leaf nodes of a phylogenetic tree. p-PCA accounts for the fact that such observations are not independent, owing to their shared evolutionary history. The method is defined for Euclidean data, but evolutionary biology also needs it for data on manifolds, particularly shapes. We provide a generalization of p-PCA to data lying on Riemannian manifolds, called Tangent p-PCA. Tangent p-PCA makes it possible to perform dimension reduction on a data set of shapes while taking into account both the non-linear structure of the shape space and the phylogenetic covariance. We show simulation results on the sphere, demonstrating well-behaved error distributions and fast convergence of estimators. Furthermore, we apply the method to a data set of mammal jaws, represented as points on a landmark manifold equipped with the LDDMM metric.
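To make the idea concrete, the following is a minimal sketch of a tangent-space p-PCA on the unit sphere. It is not the authors' estimator: it assumes a Brownian-motion style phylogenetic covariance matrix `C` built from shared branch lengths, takes as base point an iteratively computed GLS-weighted mean (an assumption made here for illustration), and then runs ordinary Euclidean p-PCA on the log-mapped data. All function names, the four-taxon tree, and the simulated data are hypothetical.

```python
import numpy as np

def sphere_exp(p, v):
    """Exponential map on the unit sphere: follow the geodesic from p along v."""
    norm = np.linalg.norm(v)
    if norm < 1e-12:
        return p.copy()
    return np.cos(norm) * p + np.sin(norm) * v / norm

def sphere_log(p, Q):
    """Log map on the unit sphere for each row of Q, as tangent vectors at p."""
    cos_t = np.clip(Q @ p, -1.0, 1.0)
    theta = np.arccos(cos_t)
    V = Q - cos_t[:, None] * p          # component of each point orthogonal to p
    norms = np.linalg.norm(V, axis=1)
    scale = np.where(norms > 1e-12, theta / np.maximum(norms, 1e-12), 0.0)
    return scale[:, None] * V

def gls_weights(C):
    """Weights w = C^{-1} 1 / (1^T C^{-1} 1) of the GLS (phylogenetic) mean."""
    w = np.linalg.solve(C, np.ones(C.shape[0]))
    return w / w.sum()

def tangent_ppca_sphere(X, C, n_iter=50, tol=1e-10):
    """Sketch: p-PCA in the tangent space at a GLS-weighted mean (assumed base point)."""
    n = X.shape[0]
    w = gls_weights(C)
    mu = w @ X
    mu = mu / np.linalg.norm(mu)        # project the Euclidean GLS mean to the sphere
    for _ in range(n_iter):
        step = w @ sphere_log(mu, X)    # GLS mean of the tangent vectors at mu
        mu = sphere_exp(mu, step)
        if np.linalg.norm(step) < tol:
            break
    V = sphere_log(mu, X)
    Vc = V - w @ V                      # center with the GLS mean (close to zero here)
    R = Vc.T @ np.linalg.solve(C, Vc) / (n - 1)   # evolutionary rate matrix
    evals, evecs = np.linalg.eigh(R)
    order = np.argsort(evals)[::-1]     # sort components by decreasing variance
    evals, evecs = evals[order], evecs[:, order]
    scores = Vc @ evecs
    return mu, evals, evecs, scores

# Hypothetical tree ((A:1,B:1):1,(C:1,D:1):1): entries are shared branch lengths.
C = np.array([[2.0, 1.0, 0.0, 0.0],
              [1.0, 2.0, 0.0, 0.0],
              [0.0, 0.0, 2.0, 1.0],
              [0.0, 0.0, 1.0, 2.0]])

# Simulated leaf data: phylogenetically correlated tangent noise at the north pole.
rng = np.random.default_rng(0)
base = np.array([0.0, 0.0, 1.0])
T = np.zeros((4, 3))
T[:, :2] = 0.1 * np.linalg.cholesky(C) @ rng.normal(size=(4, 2))
X = np.array([sphere_exp(base, t) for t in T])

mu, evals, evecs, scores = tangent_ppca_sphere(X, C)
```

Because every tangent vector at `mu` is orthogonal to `mu`, the last eigenvalue is numerically zero on the 2-sphere, so only the first two components carry information. For shape data, the sphere's exponential and log maps would be replaced by those of the shape manifold.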