Traditional methods for the analysis of compositional data consider the log-ratios between all pairs of variables with equal weight, typically in the form of aggregated contributions. This is not meaningful in contexts where relationships are known to exist only between specific variables (e.g., in metabolomic pathways) and not between other pairs. Graph theory offers a natural way to model the presence or absence of such relationships: the vertices represent the variables, and the edges represent the relations. This paper links compositional data analysis with graph signal processing, and it extends the Aitchison geometry to a setting where only selected log-ratios are considered. The presented framework retains the desirable properties of scale invariance and compositional coherence. An additional extension to include absolute information is readily made. Examples from bioinformatics and geochemistry underline the usefulness of this approach in comparison to standard methods for compositional data analysis.
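To make the idea of "selected log-ratios" concrete, the following is a minimal sketch and not the paper's implementation. It assumes a hypothetical 4-part composition and a hypothetical path graph, computes log-ratios only along the graph's edges, and checks two elementary facts: the edge log-ratios are unchanged when the composition is rescaled, and their sum of squares equals the quadratic form of the graph Laplacian applied to the log-transformed parts. The function names `edge_log_ratios` and `graph_laplacian` are illustrative only.

```python
import numpy as np


def edge_log_ratios(x, edges):
    """Return log(x[i] / x[j]) for every edge (i, j) of the graph."""
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("compositional parts must be strictly positive")
    logx = np.log(x)
    return np.array([logx[i] - logx[j] for i, j in edges])


def graph_laplacian(n_vertices, edges):
    """Build the unweighted graph Laplacian L = B^T B from the incidence matrix B."""
    B = np.zeros((len(edges), n_vertices))
    for k, (i, j) in enumerate(edges):
        B[k, i] = 1.0
        B[k, j] = -1.0
    return B.T @ B


# Hypothetical data: a 4-part composition and a path graph 0-1-2-3.
x = np.array([0.1, 0.2, 0.3, 0.4])
edges = [(0, 1), (1, 2), (2, 3)]

lr = edge_log_ratios(x, edges)

# Scale invariance: multiplying all parts by a constant leaves log-ratios unchanged.
lr_scaled = edge_log_ratios(5.0 * x, edges)
scale_invariant = np.allclose(lr, lr_scaled)

# The sum of squared edge log-ratios equals log(x)^T L log(x).
L = graph_laplacian(len(x), edges)
energy = np.log(x) @ L @ np.log(x)
matches_laplacian = np.isclose(energy, np.sum(lr ** 2))
```

Replacing `edges` with all pairs of vertices corresponds to the traditional setting, in which every pairwise log-ratio contributes with equal weight.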