Inference of species networks from genomic data under the Network Multispecies Coalescent Model is currently severely limited by heavy computational demands. It also remains unclear how complicated a network can be while still permitting consistent inference. As a step toward inferring a general species network, this work considers its tree of blobs, in which non-cut edges are contracted to nodes so that only the tree-like relationships between the taxa are shown. An identifiability theorem is established: most features of the unrooted tree of blobs can be determined from the distribution of gene quartet topologies. The proof depends on an analysis of gene quartet concordance factors under the model, together with a new combinatorial inference rule. The arguments behind this theoretical result suggest a practical algorithm for tree of blobs inference, to be fully developed in a subsequent work.
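To make the tree of blobs construction concrete, the following is a minimal sketch that contracts the non-cut edges of a small network. It rests on simplifying assumptions that are not part of this work: the network is treated as an undirected graph given as an edge list, degree-2 nodes are not suppressed, and the function names `find_cut_edges` and `tree_of_blobs` and the example network are hypothetical. It illustrates the definition only and is not the inference algorithm suggested by the identifiability arguments.

```python
from collections import defaultdict


def find_cut_edges(edges):
    """Return the indices of the cut edges (bridges) of an undirected graph."""
    adj = defaultdict(list)
    for i, (u, v) in enumerate(edges):
        adj[u].append((v, i))
        adj[v].append((u, i))

    disc, low, bridges = {}, {}, set()
    counter = 0

    def dfs(u, parent_edge):
        nonlocal counter
        disc[u] = low[u] = counter
        counter += 1
        for v, i in adj[u]:
            if i == parent_edge:
                continue  # do not walk back along the edge we arrived on
            if v in disc:
                low[u] = min(low[u], disc[v])
            else:
                dfs(v, i)
                low[u] = min(low[u], low[v])
                if low[v] > disc[u]:
                    bridges.add(i)  # no cycle passes through edge i

    for node in list(adj):
        if node not in disc:
            dfs(node, None)
    return bridges


def tree_of_blobs(edges):
    """Contract every non-cut edge; return (blob nodes, tree edges)."""
    bridges = find_cut_edges(edges)
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the endpoints of each non-cut edge into a single blob.
    for i, (u, v) in enumerate(edges):
        ru, rv = find(u), find(v)
        if i not in bridges:
            parent[ru] = rv

    blobs = defaultdict(set)
    for x in list(parent):
        blobs[find(x)].add(x)
    label = {root: frozenset(members) for root, members in blobs.items()}

    # Each cut edge survives as an edge between two blob nodes.
    tree_edges = {
        frozenset((label[find(edges[i][0])], label[find(edges[i][1])]))
        for i in bridges
    }
    return set(label.values()), tree_edges


# Hypothetical network: taxa a, b, c, d; internal nodes u, v, w form a cycle.
edges = [("a", "u"), ("b", "v"), ("u", "v"), ("v", "w"),
         ("w", "u"), ("w", "x"), ("x", "c"), ("x", "d")]
nodes, tree_edges = tree_of_blobs(edges)
```

In this example the cycle on `u`, `v`, `w` collapses to a single node, while the five cut edges remain, so the result is a tree on six nodes that retains only the tree-like relationships among the four taxa.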