In recent years, aligning a sequence to a pangenome has become a central problem in genomics and pangenomics. A fast and accurate solution to this problem can serve as a building block for many crucial tasks, such as read correction, Multiple Sequence Alignment (MSA), genome assembly, and variant calling, to name a few. In this paper we propose a new, fast, and exact method to align a string to a D-string, the latter possibly representing an MSA, a pangenome, or a partial assembly. An implementation of our tool, dsa, is publicly available at https://github.com/urbanslug/dsa.