In this paper, we describe our approach to developing a simulation software application for the fully kinetic Vlasov equation, which will be used to explore physics beyond the gyrokinetic model. Because of the high dimensionality of the problem, simulating the fully kinetic Vlasov equation requires efficient use of compute and storage capabilities. In addition, the implementation needs to be extensible with respect to the physical model and flexible with respect to the hardware used for production runs. We start with the algorithmic background for simulating the 6-D Vlasov equation using a semi-Lagrangian algorithm. We then present the performance-portable software stack, which enables production runs on pure CPU nodes as well as on nodes accelerated by AMD or Nvidia GPUs. The extensibility of our implementation is guaranteed by the described software architecture of the main kernel, which achieves a memory bandwidth of almost 500 GB/s on an Nvidia V100 GPU and around 100 GB/s on an Intel Xeon Gold CPU using a single code base. We provide performance data on multiple node-level architectures and discuss which hardware capabilities are utilized and which remain available. Finally, we quantify the network communication bottleneck of 6-D grid-based algorithms. A verification of physics beyond gyrokinetic theory, using ion Bernstein waves as an example, concludes the work.
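To illustrate the semi-Lagrangian idea mentioned above, the following is a minimal sketch for the 1-D constant-velocity advection equation on a periodic grid. It is not the 6-D implementation described in this work: the function name `semi_lagrangian_step`, the grid size, and the use of linear interpolation are assumptions chosen for brevity, and a production code would typically use higher-order interpolation.

```python
import math


def semi_lagrangian_step(f, velocity, dt, dx):
    """Advance f by one step of df/dt + v df/dx = 0 on a periodic grid.

    Hypothetical helper for illustration only: each grid value is obtained
    by following the characteristic backwards in time and interpolating
    the old solution at its foot point.
    """
    n = len(f)
    shift = velocity * dt / dx  # displacement measured in grid cells
    new_f = [0.0] * n
    for i in range(n):
        foot = i - shift             # foot of the characteristic through node i
        left = math.floor(foot)      # index of the grid node left of the foot
        weight = foot - left         # fractional distance from that node
        # Linear interpolation between the two neighbours, with periodic wrap.
        new_f[i] = (1.0 - weight) * f[left % n] + weight * f[(left + 1) % n]
    return new_f


n = 64
dx = 1.0 / n
f0 = [math.sin(2.0 * math.pi * i * dx) for i in range(n)]

# A displacement of exactly three cells: the foot points coincide with grid nodes.
f1 = semi_lagrangian_step(f0, velocity=1.0, dt=3 * dx, dx=dx)

# A fractional displacement: values are interpolated between grid nodes.
f2 = semi_lagrangian_step(f0, velocity=1.0, dt=0.4 * dx, dx=dx)
```

With a displacement of a whole number of cells, the step reduces to a cyclic shift of the data. With a fractional displacement, linear interpolation preserves the sum of the grid values but introduces numerical diffusion, which is one reason higher-order interpolation is usually preferred in practice.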