Genome rearrangement distances are an established tool in genome comparison. Work in this area may take into account various rearrangement operations representing large-scale mutations, gene orientation, the number of nucleotides in intergenic regions, and weights reflecting the expected frequency of each operation. In this article, we model genomes containing at most one copy of each gene as sequences of oriented genes, and we represent each intergenic region by its length in nucleotides. We study the Weighted Reversal, Transposition, and Indel Distance problem, which seeks a minimum-cost sequence of reversals, transpositions, and indels that transforms one genome into another. We use a structure called the Labeled Intergenic Breakpoint Graph to present an algorithm for this problem with guaranteed approximation factors for some sets of operation weights.
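To make the genome model concrete, the following is a minimal sketch of one possible representation: a list of signed genes plus a list of intergenic region lengths, with a reversal that also cuts the two flanking intergenic regions. The names `Genome` and `reversal`, the 0-based indexing, and the cut parameters `x` and `y` are assumptions made for illustration. The sketch does not cover the weighted distance, the Labeled Intergenic Breakpoint Graph, or the approximation algorithm.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Genome:
    """Illustrative model: signed genes and intergenic region lengths."""
    genes: List[int]       # signed labels, e.g. +2 or -3 (sign = orientation)
    intergenic: List[int]  # region k lies just before genes[k]; the last one is the tail

    def __post_init__(self) -> None:
        if len(self.intergenic) != len(self.genes) + 1:
            raise ValueError("expected len(genes) + 1 intergenic regions")
        if any(r < 0 for r in self.intergenic):
            raise ValueError("intergenic lengths must be non-negative")


def reversal(g: Genome, i: int, j: int, x: int, y: int) -> Genome:
    """Reverse genes[i..j] (inclusive, 0-based) and flip their orientations.

    The region left of genes[i] is cut after x nucleotides and the region
    right of genes[j] is cut after y nucleotides (hypothetical parameters).
    """
    if not (0 <= i <= j < len(g.genes)):
        raise ValueError("invalid segment")
    left, right = g.intergenic[i], g.intergenic[j + 1]
    if not (0 <= x <= left and 0 <= y <= right):
        raise ValueError("cut positions fall outside the flanking regions")

    # Reverse the segment and negate each gene to flip its orientation.
    genes = g.genes[:i] + [-v for v in reversed(g.genes[i:j + 1])] + g.genes[j + 1:]
    # Regions strictly inside the segment keep their lengths but reverse order.
    inner = list(reversed(g.intergenic[i + 1:j + 1]))
    regions = (
        g.intergenic[:i]
        + [x + y]                       # new region before the reversed segment
        + inner
        + [(left - x) + (right - y)]    # new region after the reversed segment
        + g.intergenic[j + 2:]
    )
    return Genome(genes, regions)
```

A reversal in this sketch rearranges nucleotides between the two flanking regions without changing their total, whereas an indel would change the total length; that difference is why both kinds of operation appear in the problem.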