Aitchison's Compositional Data Analysis 40 Years On: A Reappraisal

From arXiv: 25 pages, 18 figures, plus Supplementary Material. This is the third revision of the paper; the main changes are in Section 6. This version has been accepted for publication in Statistical Science.

The development of John Aitchison's approach to compositional data analysis is followed since his paper read to the Royal Statistical Society in 1982. Aitchison's logratio approach, which was proposed to solve the problematic aspects of working with data with a fixed sum constraint, is summarized and reappraised. It is maintained that the properties on which this approach was originally built, the main one being subcompositional coherence, are not required to be satisfied exactly -- quasi-coherence is sufficient, that is near enough to being coherent for all practical purposes. This opens up the field to using simpler data transformations, such as power transformations, that permit zero values in the data. The additional property of exact isometry, which was subsequently introduced and not in Aitchison's original conception, imposed the use of isometric logratio transformations, but these are complicated and problematic to interpret, involving ratios of geometric means. If this property is regarded as important in certain analytical contexts, for example unsupervised learning, it can be relaxed by showing that regular pairwise logratios, as well as the alternative quasi-coherent transformations, can also be quasi-isometric, meaning they are close enough to exact isometry for all practical purposes. It is concluded that the isometric and related logratio transformations such as pivot logratios are not a prerequisite for good practice, although many authors insist on their obligatory use. This conclusion is fully supported here by case studies in geochemistry and in genomics, where the good performance is demonstrated of pairwise logratios, as originally proposed by Aitchison, or Box-Cox power transforms of the original compositions where no zero replacements are necessary.
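
The two families of transformation contrasted in the abstract can be illustrated with a minimal sketch. It assumes the standard textbook Box-Cox form (x^λ − 1)/λ, which may differ in detail from the variant used in the paper; the function names and the toy composition are hypothetical and not taken from the paper. It shows that a pairwise logratio is unchanged when a subcomposition is re-closed (subcompositional coherence), that the power transform is defined at zero where the logarithm is not, and that the power transform approaches the logarithm as λ shrinks.

```python
import math

def close(parts):
    """Rescale non-negative parts so they sum to 1 (closure)."""
    total = sum(parts)
    return [p / total for p in parts]

def pairwise_logratio(parts, i, j):
    """Log of the ratio of part i to part j; both must be strictly positive."""
    return math.log(parts[i] / parts[j])

def box_cox(x, lam):
    """Standard Box-Cox power transform for lam > 0 (assumed form, see lead-in)."""
    return (x ** lam - 1.0) / lam

# Toy four-part composition whose last part is zero (illustrative values only).
composition = close([6.0, 3.0, 1.0, 0.0])

# Subcomposition of the first three parts, re-closed to sum to 1.
subcomposition = close(composition[:3])

# The ratio of parts 0 and 1 does not depend on which other parts are present.
lr_full = pairwise_logratio(composition, 0, 1)
lr_sub = pairwise_logratio(subcomposition, 0, 1)

# The power transform accepts the zero part; math.log(0.0) would raise ValueError.
zero_transformed = box_cox(composition[3], 0.5)

# For small lam the power transform is close to the natural logarithm.
gap_to_log = abs(box_cox(composition[1], 1e-6) - math.log(composition[1]))
```

Here `lr_full` and `lr_sub` agree up to floating-point rounding, while `zero_transformed` is finite. The power transform is not exactly coherent, which is why the abstract describes such transformations as quasi-coherent.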
