The FM-index is an efficient data structure for string search and is widely used in next-generation sequencing (NGS) applications such as sequence alignment and de novo assembly. Recently, FM-indexing has been applied even at the level of individual reads, creating demand for an efficient FM-index construction algorithm. In this work, we propose a hardware-compatible Self-Aided Incremental Indexing (SAII) algorithm and its hardware architecture. This novel algorithm builds the FM-index with no memory overhead, and the hardware system that realizes it can be very compact. A parallel architecture and a special prefetch controller are designed to enhance computational efficiency. An SAII-based FM-index constructor is implemented on an Altera Stratix V FPGA board. The presented constructor supports DNA sequences of up to 131,072 bp, which is sufficient for small-scale references and for reads obtained from current major sequencing platforms. Because the proposed constructor needs very few hardware resources, it can be easily integrated into different hardware accelerators designed for FM-index-based applications.
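For readers unfamiliar with the data structure, the following minimal Python sketch illustrates what an FM-index contains and how it answers a string-search query. It is not the SAII algorithm proposed in this work: it builds the index naively by sorting all suffixes, which is exactly the kind of memory-hungry construction that an incremental method aims to avoid. The function names `build_fm_index` and `count_occurrences` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def build_fm_index(text):
    """Build a toy FM-index (BWT, C table, Occ table) by naive suffix sorting.

    Assumption: this is a textbook construction for illustration only,
    not the SAII algorithm described in the abstract.
    """
    assert "$" not in text
    s = text + "$"  # sentinel, smaller than A/C/G/T in ASCII order

    # Suffix array: start positions of all suffixes in sorted order.
    sa = sorted(range(len(s)), key=lambda i: s[i:])

    # BWT: the character preceding each sorted suffix (wraps to "$" for i == 0).
    bwt = "".join(s[i - 1] for i in sa)

    # C[ch]: number of characters in s that are strictly smaller than ch.
    alphabet = sorted(set(s))
    counts = Counter(s)
    c, total = {}, 0
    for ch in alphabet:
        c[ch] = total
        total += counts[ch]

    # occ[ch][i]: number of occurrences of ch in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in alphabet}
    for i, ch in enumerate(bwt):
        for a in alphabet:
            occ[a][i + 1] = occ[a][i] + (1 if a == ch else 0)

    return bwt, c, occ


def count_occurrences(pattern, bwt, c, occ):
    """Count matches of pattern using backward search over the FM-index."""
    lo, hi = 0, len(bwt)  # half-open range of matching suffix-array rows
    for ch in reversed(pattern):
        if ch not in c:
            return 0
        lo = c[ch] + occ[ch][lo]
        hi = c[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    bwt, c, occ = build_fm_index("ACGTACGA")
    print(count_occurrences("ACG", bwt, c, occ))
```

The search loop touches only the C and Occ tables, one pattern character at a time, which is why query time depends on the pattern length rather than on the indexed sequence length. The full Occ table used here is the simplest possible layout; practical implementations typically sample it to save memory.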