We explore an online reinforcement learning (RL) paradigm for optimizing parallel particle tracing performance in distributed-memory systems. Our method combines three novel components: (1) a workload donation model, (2) a high-order workload estimation model, and (3) a communication cost model, to dynamically optimize the performance of data-parallel particle tracing. First, we design an RL-based workload donation model. Our workload donation model monitors the workload of processes and creates RL agents that donate particles and data blocks from high-workload processes to low-workload processes to minimize the execution time. The agents learn the donation strategy on the fly based on reward and cost functions. The reward and cost functions are designed to account for each process's workload change and the data transfer cost of every donation action. Second, we propose an online workload estimation model to help the RL agents estimate the workload distribution of processes in future computations. Third, we design a communication cost model that considers both block and particle data exchange costs, helping the agents make effective decisions with minimal communication cost. We demonstrate that our algorithm adapts to different flow behaviors in large-scale fluid dynamics, ocean, and weather simulation data. Our algorithm improves parallel particle tracing performance in terms of parallel efficiency, load balance, and I/O and communication costs in evaluations with up to 16,384 processors.
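To make the donation idea concrete, the sketch below shows one plausible way an RL agent on a high-workload process could pick a receiver and learn from the outcome. This is a minimal, hypothetical illustration, not the authors' implementation: the class name `DonationAgent`, the tabular Q-learning formulation, and the cost coefficients in `donation_reward` are all assumptions introduced here for clarity.

```python
# Hypothetical sketch (not the paper's code): an epsilon-greedy tabular
# Q-learning agent that a high-workload process could use to decide which
# low-workload process should receive donated particles or data blocks.
import random
from collections import defaultdict


class DonationAgent:
    def __init__(self, n_targets, alpha=0.5, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)    # Q[(state, action)] -> learned value
        self.n_targets = n_targets     # number of candidate receiver processes
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, state):
        # Action i = donate a unit of work to candidate receiver i;
        # epsilon-greedy exploration over the learned Q-values.
        if random.random() < self.epsilon:
            return random.randrange(self.n_targets)
        return max(range(self.n_targets), key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update, applied online after each
        # donation round so the strategy adapts as the flow evolves.
        best_next = max(self.q[(next_state, a)] for a in range(self.n_targets))
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])


def donation_reward(workload_drop, particles_sent, blocks_sent,
                    particle_cost=1e-4, block_cost=1e-2):
    # Reward = estimated reduction in workload imbalance minus a modeled
    # communication cost for the particles and blocks moved.
    # The cost coefficients are illustrative placeholders, not paper values.
    return workload_drop - (particles_sent * particle_cost + blocks_sent * block_cost)
```

In this reading, the state would encode the (estimated) workload imbalance between the donor and its candidate receivers, and the reward couples the workload change with the communication cost model, mirroring the trade-off the abstract describes.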