Epistasis is a phenomenon in which a phenotypic outcome is determined by the interaction of genetic variation at two or more loci and cannot be attributed to the additive combination of the effects of the individual loci. Although more than 100 years have passed since William Bateson introduced the concept, it remains a topic of active research. Locating epistatic interactions is a computationally expensive challenge, because the number of locus combinations to analyze grows exponentially with the interaction order. Authors in this field have resorted to a multitude of hardware architectures to speed up the search, but little to no attention has been paid to the vector instructions that current CPUs include in their instruction sets. This work extends an existing third-order exhaustive algorithm to support the search for epistatic interactions of any order, and discusses multiple SIMD implementations of the functions that compose the search, written with Intel AVX intrinsics. Results obtained with GCC and the Intel compiler show that the 512-bit explicit vector implementation proposed here outperforms all other implementations evaluated. In the scenarios tested, the proposed 512-bit vectorization accelerates the original implementation of the algorithm by average factors of 7 and 12 with GCC and the Intel compiler, respectively.
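To make the combinatorial growth mentioned in the abstract concrete, the following is a minimal sketch in C of how an exhaustive search of arbitrary order might count and enumerate locus combinations. It is an illustration under stated assumptions, not the implementation evaluated in this work: the function names `count_combinations` and `next_combination` are hypothetical, loci are assumed to be identified by indices in `[0, n)`, and no vectorization is shown.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of distinct k-locus combinations among
 * n loci, i.e. the binomial coefficient C(n, k).
 * Multiplying before dividing keeps every intermediate value an exact
 * integer; the result overflows uint64_t for large n and k. */
uint64_t count_combinations(uint64_t n, uint64_t k)
{
    if (k > n) return 0;
    if (k > n - k) k = n - k;          /* C(n, k) == C(n, n - k) */
    uint64_t result = 1;
    for (uint64_t i = 1; i <= k; ++i)
        result = result * (n - k + i) / i;
    return result;
}

/* Hypothetical helper: advance idx[0..k-1], a strictly increasing list
 * of locus indices in [0, n), to the next combination in lexicographic
 * order. Assumes k <= n. Returns false, leaving idx unchanged, when idx
 * already holds the last combination. */
bool next_combination(size_t *idx, size_t k, size_t n)
{
    for (size_t pos = k; pos-- > 0; ) {
        if (idx[pos] < n - k + pos) {  /* this position can still grow */
            ++idx[pos];
            for (size_t j = pos + 1; j < k; ++j)
                idx[j] = idx[j - 1] + 1;   /* reset the tail */
            return true;
        }
    }
    return false;
}
```

Starting from `idx = {0, 1, ..., k-1}` and calling `next_combination` until it returns `false` visits every combination exactly once, whatever the order `k`. For 1,000 loci, `count_combinations` gives 166,167,000 combinations at third order and 41,417,124,750 at fourth order, which is why the cost of evaluating each combination matters so much.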