This paper proposes an event-driven solution to genotype imputation, a technique used to statistically infer missing genetic markers in DNA. The work implements the widely accepted Li and Stephens model, the primary contributor to the computational complexity of modern x86 solutions, to determine whether further investigation of the application is warranted in the event-driven domain. The model is implemented as a graph-based Hidden Markov Model and executed as a customised forward/backward dynamic programming algorithm. The solution uses an event-driven paradigm to map the algorithm onto thousands of concurrent cores, where events are small messages that carry both control and data within the algorithm. The design of a single processing element is discussed; this is then extended across multiple FPGAs and executed on a custom RISC-V NoC FPGA cluster called POETS. Results demonstrate how the algorithm scales with increasing hardware resources, and a 48-FPGA run demonstrates a 270x reduction in wall-clock processing time compared to a single-threaded x86 solution. An optimisation of the algorithm via linear interpolation is then introduced and tested, with results demonstrating a reduction in wall-clock time of approximately 5 orders of magnitude compared to a similarly optimised x86 solution.
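For readers unfamiliar with the underlying model, the following is a minimal sequential sketch of Li and Stephens style imputation using a forward/backward pass over a haploid reference panel. It is not the event-driven POETS implementation described in this paper. The function name `impute_dosages`, the switch probability `rho`, the mismatch probability `eps`, and the encoding of missing markers as `-1` are all assumptions made for illustration.

```python
import numpy as np

def impute_dosages(panel, target, rho=0.01, eps=0.001):
    """Illustrative Li and Stephens style imputation (hypothetical helper).

    panel  : (K, M) array of 0/1 alleles, K reference haplotypes, M markers.
    target : length-M array of 0/1 alleles, with -1 marking a missing marker.
    rho    : assumed per-marker probability of switching the copied haplotype.
    eps    : assumed probability that an observed allele mismatches its template.
    Returns the posterior expected allele (dosage) at every marker.
    """
    panel = np.asarray(panel)
    target = np.asarray(target)
    K, M = panel.shape

    # Emission probabilities: 1 where the target is missing, otherwise
    # (1 - eps) on a match with the reference haplotype and eps on a mismatch.
    emit = np.ones((K, M))
    observed = target >= 0
    match = panel[:, observed] == target[observed]
    emit[:, observed] = np.where(match, 1.0 - eps, eps)

    # Forward pass. The transition matrix is (1 - rho) * I + (rho / K) * ones,
    # so each step costs O(K) instead of O(K^2). Columns are normalised.
    fwd = np.empty((K, M))
    fwd[:, 0] = emit[:, 0] / K
    fwd[:, 0] /= fwd[:, 0].sum()
    for m in range(1, M):
        pred = (1.0 - rho) * fwd[:, m - 1] + rho / K
        fwd[:, m] = pred * emit[:, m]
        fwd[:, m] /= fwd[:, m].sum()

    # Backward pass with the same transition structure.
    bwd = np.empty((K, M))
    bwd[:, -1] = 1.0
    for m in range(M - 2, -1, -1):
        w = bwd[:, m + 1] * emit[:, m + 1]
        bwd[:, m] = (1.0 - rho) * w + (rho / K) * w.sum()
        bwd[:, m] /= bwd[:, m].sum()

    # Posterior over the copied haplotype at each marker, then expected allele.
    post = fwd * bwd
    post /= post.sum(axis=0, keepdims=True)
    return (post * panel).sum(axis=0)

if __name__ == "__main__":
    panel = [[0, 0, 0, 0, 0],
             [1, 1, 1, 1, 1]]
    target = [1, 1, -1, 1, 1]  # the middle marker is missing
    print(impute_dosages(panel, target))
```

In this sketch each marker depends only on its immediate neighbours in the forward and backward sweeps, which is the kind of local dependency that a message-passing formulation can exploit.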