Likelihood-based methods are widely considered the best approaches for reconstructing ancestral states. Although much effort has gone into studying the properties of these methods, previous work often assumes that both the tree topology and the edge lengths are known. In some scenarios the tree topology may be reasonably well known for the taxa under study. When the sequence length is much smaller than the number of species, however, edge lengths are unlikely to be estimated accurately. We study the consistency of the maximum likelihood and empirical Bayes estimators of the ancestral state of discrete traits in such settings under a star tree. We prove that likelihood-based reconstruction is consistent under symmetric models but can be inconsistent under non-symmetric models. We show, however, that a simple consistent estimator of the ancestral state is available under non-symmetric models. The results illustrate that likelihood methods can unexpectedly have undesirable properties as the number of sequences considered becomes very large. Broader implications of the results are discussed.