The complexity of combustion simulations demands state-of-the-art high-performance computing (HPC) tools to reduce their time-to-solution. A current trend in HPC systems is the use of CPUs with SIMD or vector extensions to exploit data parallelism. Our work proposes a strategy to improve the automatic vectorization of finite element-based scientific codes. The approach applies a parametric configuration to the data structures to help the compiler detect the blocks of code that can take advantage of vector computation while keeping the code portable. We analyze in detail the computational impact of this methodology on the different stages of a CFD solver, using the PRECCINSTA burner simulation as the test case. Our parametric implementation helps the compiler generate more vector instructions in the assembly operation: the total number of executed instructions is reduced by up to 9.3 times, while the instructions per cycle (IPC) and the CPU frequency remain constant. The proposed strategy improves the performance of the CFD case under study by up to 4.67 times on the MareNostrum 4 supercomputer.
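As an illustration of what a parametric data-structure configuration could look like, the following is a minimal C++ sketch, not the implementation evaluated in this work. It assumes that elements are processed in packs whose size is a compile-time parameter and that the element index is the innermost, contiguous dimension. The names `VECTOR_SIZE`, `ElementPack`, `gather` and `element_average`, the choice of four nodes per element, and the choice of C++ itself are all assumptions made for this example.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical compile-time parameter: number of elements processed together.
// It can be overridden at build time (e.g. -DVECTOR_SIZE=16) to match the
// vector width of the target CPU without touching the source.
#ifndef VECTOR_SIZE
#define VECTOR_SIZE 8
#endif

constexpr std::size_t kVec = VECTOR_SIZE;
constexpr std::size_t kNodes = 4;  // assumed: four nodes per element

using Connectivity = std::vector<std::array<std::size_t, kNodes>>;

// One pack stores the nodal values of kVec elements. The element index is
// the innermost dimension, so lanes of the same node are contiguous in memory.
struct ElementPack {
    std::array<std::array<double, kVec>, kNodes> value;
};

// Gather nodal values for the pack that starts at element `first`.
// Assumes `conn` is not empty. An incomplete last pack is padded by
// repeating the last valid element.
ElementPack gather(const std::vector<double>& nodal,
                   const Connectivity& conn,
                   std::size_t first) {
    ElementPack pack{};
    for (std::size_t v = 0; v < kVec; ++v) {
        const std::size_t e = std::min(first + v, conn.size() - 1);
        for (std::size_t n = 0; n < kNodes; ++n) {
            pack.value[n][v] = nodal[conn[e][n]];
        }
    }
    return pack;
}

// Element-level kernel: average of the nodal values of each element.
// The inner loops have unit stride and a trip count known at compile time,
// which makes them candidates for compiler auto-vectorization.
std::array<double, kVec> element_average(const ElementPack& pack) {
    std::array<double, kVec> avg{};
    for (std::size_t n = 0; n < kNodes; ++n) {
        for (std::size_t v = 0; v < kVec; ++v) {
            avg[v] += pack.value[n][v];
        }
    }
    for (std::size_t v = 0; v < kVec; ++v) {
        avg[v] /= static_cast<double>(kNodes);
    }
    return avg;
}
```

In this sketch, setting `VECTOR_SIZE` to 1 recovers a scalar element-by-element loop, so the same source can serve both scalar and vector targets. Whether a given compiler actually emits vector instructions for the inner loops depends on the compiler and its flags and should be checked in the generated assembly or the vectorization report.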