Genome sequence alignment is at the core of many biological applications. Advances in sequencing technologies produce a tremendous amount of data, making sequence alignment a critical bottleneck in bioinformatics analysis. Existing hardware accelerators for alignment suffer from limited on-chip memory, costly data movement, and poorly optimized alignment algorithms, so they cannot concurrently process the massive amount of data generated by sequencing machines. In this paper, we propose RAPIDx, a ReRAM-based accelerator that uses processing-in-memory (PIM) for sequence alignment. RAPIDx achieves superior efficiency and performance via software-hardware co-design. First, we propose an adaptive banded parallelism alignment algorithm suited to PIM architectures. Compared to the original dynamic programming-based alignment, the proposed algorithm significantly reduces the computational complexity, data bit width, and memory footprint at the cost of negligible accuracy degradation. We then propose an efficient PIM architecture that implements the proposed algorithm. The data flow in RAPIDx achieves four-level parallelism, and we design an in-situ alignment computation flow in ReRAM, delivering $5.5$-$9.7\times$ efficiency and throughput improvements compared to our previous PIM design, RAPID. RAPIDx is reconfigurable and can serve as a co-processor integrated into existing genome analysis pipelines to accelerate sequence alignment or edit distance calculation. On short-read alignment, RAPIDx delivers $131.1\times$ and $46.8\times$ throughput improvements over state-of-the-art CPU and GPU libraries, respectively. Compared to ASIC accelerators for long-read alignment, the performance of RAPIDx is $1.8$-$2.9\times$ higher.
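To illustrate the general idea behind banded alignment, the following is a minimal sketch of a fixed-band edit distance in Python. It is not the adaptive banded parallelism algorithm of RAPIDx, whose details are not given in this abstract; the function name `banded_edit_distance` and the parameter `band` are hypothetical and chosen for illustration only. The sketch fills only the dynamic-programming cells within `band` of the main diagonal, which is where the reduction in work comes from; for simplicity it still allocates full-length rows rather than band-sized ones.

```python
from typing import Optional


def banded_edit_distance(a: str, b: str, band: int) -> Optional[int]:
    """Edit distance computed only over cells with |i - j| <= band.

    Illustrative fixed-band sketch (an assumption, not the RAPIDx method).
    Returns None when the final cell lies outside the band, i.e. when the
    length difference of the two sequences exceeds `band`.
    """
    n, m = len(a), len(b)
    if abs(n - m) > band:
        return None

    inf = float("inf")
    # Row 0: aligning the empty prefix of `a` with b[:j] costs j insertions.
    prev = [inf] * (m + 1)
    for j in range(min(m, band) + 1):
        prev[j] = j

    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        lo = max(0, i - band)   # leftmost column inside the band
        hi = min(m, i + band)   # rightmost column inside the band
        for j in range(lo, hi + 1):
            if j == 0:
                curr[j] = i     # a[:i] against the empty prefix of b
                continue
            substitute = prev[j - 1] + (a[i - 1] != b[j - 1])
            delete = prev[j] + 1        # inf if the cell above is out of band
            insert = curr[j - 1] + 1    # inf if the cell to the left is out of band
            curr[j] = min(substitute, delete, insert)
        prev = curr

    return prev[m]
```

With a band of width `band`, roughly `(2 * band + 1) * len(a)` cells are evaluated instead of `len(a) * len(b)`. The result equals the true edit distance whenever an optimal alignment stays inside the band, and is an upper bound otherwise.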