Phylogenetic networks are a flexible model of evolution that can represent reticulate evolution and handle complex data. Tree-based networks, which are phylogenetic networks that have a spanning tree with the same root and leaf-set as the network itself, have been well studied. However, not all networks are tree-based. Francis-Semple-Steel (2018) thus introduced several indices to measure the deviation of rooted binary phylogenetic networks $N$ from being tree-based, such as the minimum number $\delta^\ast(N)$ of additional leaves needed to make $N$ tree-based, and the minimum difference $\eta^\ast(N)$ between the number of vertices of $N$ and the number of vertices of a subtree of $N$ that shares the root and leaf set with $N$. Hayamizu (2021) has established a canonical decomposition of almost-binary phylogenetic networks of $N$, called the maximal zig-zag trail decomposition, which has many implications including a linear time algorithm for computing $\delta^\ast(N)$. The Maximum Covering Subtree Problem (MCSP) is the problem of computing $\eta^\ast(N)$, and Davidov et al. (2022) showed that this can be solved in polynomial time (in cubic time when $N$ is binary) by an algorithm for the minimum cost flow problem. In this paper, under the assumption that $N$ is almost-binary (i.e. each internal vertex has in-degree and out-degree at most two), we show that $\delta^\ast(N)\leq \eta^\ast (N)$ holds, which is tight, and give a characterisation of such phylogenetic networks $N$ that satisfy $\delta^\ast(N)=\eta^\ast(N)$. Our approach uses the canonical decomposition of $N$ and focuses on how the maximal W-fences (i.e. the forbidden subgraphs of tree-based networks) are connected to maximal M-fences in the network $N$. Our results introduce a new class of phylogenetic networks for which MCSP can be solved in linear time, which can be seen as a generalisation of tree-based networks.
翻译:暂无翻译