DNA pattern matching is essential for many widely used bioinformatics applications. Disease diagnosis is one of them, since analyzing changes in DNA sequences can improve our understanding of possible genetic diseases. The remarkable growth in the size of DNA datasets has made it challenging to discover DNA patterns efficiently in terms of both run time and power consumption. In this paper, we propose an efficient hardware/software co-design that uses DNA pattern matching to determine the chance of occurrence of repeat-expansion diseases. The proposed design parallelizes the DNA pattern matching task using associative memory realized with analog content-addressable memory, and it implements an algorithm that returns the maximum number of consecutive occurrences of a specific pattern within a DNA sequence. We fully implement all the required hardware circuits in PTM 45-nm technology and evaluate the proposed architecture on a practical human DNA dataset. The results show that our design is energy-efficient and significantly accelerates the DNA pattern matching task compared with previous approaches described in the literature.
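To make the core computation concrete, the following is a minimal software-only sketch of the quantity the algorithm returns: the maximum number of back-to-back occurrences of a pattern within a sequence. It is not the proposed hardware design and does not model the analog content-addressable memory. The function name `max_consecutive_repeats`, the example sequences, and the pattern `CAG` are illustrative assumptions rather than details taken from the paper.

```python
def max_consecutive_repeats(sequence: str, pattern: str) -> int:
    """Return the longest run of back-to-back copies of `pattern` in `sequence`.

    Hypothetical helper for illustration only; a sequential software
    reference, not the parallel associative-memory implementation.
    """
    if not pattern:
        raise ValueError("pattern must be non-empty")

    k = len(pattern)
    n = len(sequence)

    # runs[i] = number of consecutive copies of `pattern` starting at index i.
    # One extra slot so that runs[i + k] is valid when i + k == n.
    runs = [0] * (n + 1)
    best = 0

    # Scan right to left so runs[i + k] is already known when we reach i.
    for i in range(n - k, -1, -1):
        if sequence[i:i + k] == pattern:
            runs[i] = runs[i + k] + 1
            best = max(best, runs[i])

    return best


if __name__ == "__main__":
    # Illustrative input: three adjacent CAG copies, then one isolated copy.
    print(max_consecutive_repeats("CAGCAGCAGTTCAG", "CAG"))
```

The sketch does one substring comparison per position, so it runs in O(n·k) time for a sequence of length n and a pattern of length k. In the proposed design these per-position comparisons are the step that the associative memory is meant to carry out in parallel.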