Several implicit methods for inferring horizontal gene transfer (HGT) focus on pairs of genes that have diverged only after the divergence of the two species in which the genes reside. This situation defines the edge set of a graph, the later-divergence-time (LDT) graph, whose vertices correspond to genes colored by their species. We investigate these graphs in the setting of relaxed scenarios, i.e., evolutionary scenarios that encompass all commonly used variants of duplication-transfer-loss scenarios in the literature. We characterize LDT graphs as a subclass of properly vertex-colored cographs, and provide a polynomial-time recognition algorithm as well as an algorithm to construct a relaxed scenario that explains a given LDT graph. An edge in an LDT graph implies that the two corresponding genes are separated by at least one HGT event. The converse is not true, however. We show that the complete xenology relation is described by an rs-Fitch graph, i.e., a complete multipartite graph satisfying constraints on the vertex coloring. This class of vertex-colored graphs is also recognizable in polynomial time. Finally, we use simulations of evolutionary scenarios with a wide range of duplication, loss, and HGT events to address the question of how much information about all HGT events is contained in LDT graphs. In particular, we show that a simple greedy graph editing scheme can be used to efficiently detect HGT events that are implicitly contained in LDT graphs.
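To make the phrase "properly vertex-colored cograph" concrete, here is a minimal sketch of a brute-force check. It is not the recognition algorithm referred to above: it tests only a necessary condition for being an LDT graph (proper coloring and the absence of an induced path on four vertices), and the additional constraints that single out LDT graphs are not covered. The function names and the input format (a vertex list, an edge list, and a dictionary mapping each gene to its species) are assumptions chosen for illustration.

```python
from itertools import combinations, permutations


def is_properly_colored(edges, color):
    """True if no edge joins two genes from the same species."""
    return all(color[u] != color[v] for u, v in edges)


def has_induced_p4(vertices, edges):
    """True if some four vertices induce a path a-b-c-d.

    Brute force over all 4-subsets, so only suitable for small graphs.
    """
    edge_set = {frozenset(e) for e in edges}

    def adjacent(a, b):
        return frozenset((a, b)) in edge_set

    for quad in combinations(vertices, 4):
        for a, b, c, d in permutations(quad):
            path = adjacent(a, b) and adjacent(b, c) and adjacent(c, d)
            chords = adjacent(a, c) or adjacent(a, d) or adjacent(b, d)
            if path and not chords:
                return True
    return False


def is_properly_colored_cograph(vertices, edges, color):
    """Necessary (not sufficient) condition for an LDT graph.

    A graph is a cograph exactly when it contains no induced P4.
    """
    return is_properly_colored(edges, color) and not has_induced_p4(vertices, edges)


if __name__ == "__main__":
    # Hypothetical toy input: genes a, b, c from species A, B, A.
    genes = ["a", "b", "c"]
    species = {"a": "A", "b": "B", "c": "A"}
    print(is_properly_colored_cograph(genes, [("a", "b"), ("b", "c")], species))
```

The exhaustive search is quartic in the number of vertices and is meant only to illustrate the definition; linear-time cograph recognition algorithms exist and would be preferable in practice.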