A recently introduced approach termed "Assembly Theory", featuring a computable index based on basic principles of statistical compression, has been claimed to be a novel and superior approach to classifying and distinguishing living from non-living systems and to characterizing the complexity of molecular biosignatures. Here, we demonstrate that the assembly pathway method underlying this index is a suboptimal, restricted version of Huffman's encoding (Shannon-Fano type), widely adopted in computer science in the 1950s, and that it is comparable (or inferior) to other popular statistical and computable compression schemes. We show how simple modular instructions can mislead the assembly index, so that it fails to capture subtleties beyond trivial statistical properties that are not realistic in biological systems. We present cases whose low complexity can diverge arbitrarily from their random-like appearance, to which the assembly pathway method would assign arbitrarily high statistical significance, and we show that the method fails in simple cases (synthetic or natural). Our theoretical and empirical results imply that the assembly index, whose computable nature we show is not an advantage, offers no substantial advantage over existing concepts and methods, whether computable or uncomputable. Alternatives are discussed.
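As a rough illustration of the gap between low complexity and random-like appearance, the following minimal sketch generates a byte sequence from a few lines of code and checks how well a purely statistical compressor handles it. It is not one of the paper's own cases and it does not compute the assembly index: `zlib` stands in for a generic statistical compression scheme, and the helper names `lcg_bytes` and `compression_ratio`, the generator constants, and the sequence length are all assumptions chosen for illustration.

```python
import zlib


def lcg_bytes(n, seed=1):
    """Return n bytes produced by a small linear congruential generator.

    The multiplier and increment are the commonly cited Numerical Recipes
    constants; they are an illustrative choice, not taken from the paper.
    """
    state = seed
    out = bytearray()
    for _ in range(n):
        state = (1664525 * state + 1013904223) % 2**32
        out.append(state >> 24)  # keep only the high byte of the 32-bit state
    return bytes(out)


def compression_ratio(data):
    """Compressed size divided by original size, using zlib at its highest level."""
    return len(zlib.compress(data, 9)) / len(data)


n = 4096  # hypothetical sequence length
generated = lcg_bytes(n)        # short program, random-looking output
repetitive = b"AB" * (n // 2)   # short program, visibly regular output

print("generator output ratio:", round(compression_ratio(generated), 3))
print("repetitive input ratio:", round(compression_ratio(repetitive), 3))
```

Both sequences come from very short programs, so both have low algorithmic complexity. Under these assumptions, the repetitive sequence should shrink to a small fraction of its size, while the generator output should stay close to its original length, meaning a purely statistical measure would treat it as highly complex.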