Liquid Argon Time Projection Chamber (LArTPC) technology plays an essential role in many current and future neutrino experiments. Accurate and fast simulation is critical to developing efficient analysis algorithms and precise physics model projections. Simulation speed becomes even more important as Deep Learning algorithms are used more widely in LArTPC analysis, because their training requires large simulated datasets. Heterogeneous computing is an efficient way to delegate compute-heavy tasks to specialized hardware. However, because the landscape of compute accelerators is evolving quickly, it is increasingly difficult to keep manually adapting code to the latest hardware and software environments. A solution that is portable across multiple hardware architectures without substantially compromising performance would be very beneficial, especially for long-term projects such as LArTPC simulations. In search of a portable, scalable, and maintainable software solution for LArTPC simulations, we have started to explore high-level portable programming frameworks that support several hardware backends. In this paper, we present our experience porting the LArTPC simulation code in the Wire-Cell toolkit to NVIDIA GPUs, first with the CUDA programming model and then with a portable library called Kokkos. We present preliminary performance results on NVIDIA V100 GPUs and multi-core CPUs, followed by a discussion of the factors affecting performance and plans for future improvements.