Beyond identifying genetic variants, we introduce a set of Boolean relations that comprehensively classifies the relation between every pair of variants by taking all minimal alignments into account. We present an efficient algorithm to compute these relations, including a novel way of computing all minimal alignments within the best theoretical complexity bounds. We show that, for all variants of the CFTR gene in dbSNP, these relations are common and many are non-trivial. Finally, we present an approach for storing and indexing variants in a database that enables efficient querying for all these relations.