Optimisation of job scheduling for supercomputers with burst buffers

From arXiv. Master's thesis in computer science, supervised by Krzysztof Rzadca. Basis for the Euro-Par 2021 publication: Plan-based Job Scheduling for Supercomputers with Shared Burst Buffers (arXiv:2109.00082). Source code: https://github.com/jankopanski/Burst-Buffer-Scheduling

The ever-increasing gap between compute and I/O performance in HPC platforms, together with the development of novel NVMe storage devices (NVRAM), led to the emergence of the burst buffer concept: an intermediate persistent storage layer logically positioned between random-access main memory and a parallel file system. Since the appearance of this technology, numerous supercomputers have been equipped with burst buffers, exploring various architectures. Despite the development of real-world architectures as well as research concepts, Resource and Job Management Systems, such as Slurm, provide only marginal support for scheduling jobs with burst buffer requirements. This research is primarily motivated by the alarming observation that existing job schedulers omit burst buffers from the reservations made during backfilling. In this dissertation, we build a detailed supercomputer simulator based on Batsim and SimGrid, which is capable of simulating I/O contention and I/O congestion effects. Due to the lack of publicly available workloads with burst buffer requests, we create a burst buffer request distribution model derived from Parallel Workload Archive logs. We investigate the impact of burst buffer reservations on the overall efficiency of online job scheduling for two canonical algorithms: First-Come-First-Served (FCFS) and Shortest-Job-First (SJF) EASY-backfilling. Our results indicate that the lack of burst buffer reservations in backfilling may significantly deteriorate scheduling performance. [...] Furthermore, this lack of reservations may cause the starvation of medium-size and wide jobs. Finally, we propose a burst-buffer-aware plan-based scheduling algorithm with simulated annealing optimisation, which reduces the mean waiting time by over 20% and the mean bounded slowdown by 27% compared to SJF EASY-backfilling.
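To make the backfilling issue concrete, here is a minimal Python sketch of an EASY-style admission check that can either include or ignore burst buffers in the head job's reservation. It is an illustration under stated assumptions, not the thesis implementation and not Slurm's logic: the names `Job` and `can_backfill`, the `reserve_bb` flag, and the idea of passing in precomputed "spare" resources at the head job's planned start are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    nodes: int         # requested compute nodes
    burst_buffer: int  # requested burst buffer capacity (hypothetical unit: GB)
    walltime: int      # user-provided runtime estimate in seconds


def can_backfill(job, now, free_nodes, free_bb, head_start,
                 spare_nodes, spare_bb, reserve_bb=True):
    """Decide whether `job` may start at `now` without delaying the head job.

    head_start  -- planned start time of the job at the head of the queue
    spare_nodes -- nodes still free at head_start after the head job gets its share
    spare_bb    -- burst buffer still free at head_start after the head job's share
    reserve_bb  -- if False, burst buffers are left out of the reservation
                   (assumed here to model the behaviour the abstract criticises)
    """
    # The job must fit into the resources that are free right now.
    if job.nodes > free_nodes or job.burst_buffer > free_bb:
        return False
    # A job that finishes before the head job's planned start cannot delay it.
    if now + job.walltime <= head_start:
        return True
    # Otherwise it may only use resources the head job will not need.
    if job.nodes > spare_nodes:
        return False
    if reserve_bb and job.burst_buffer > spare_bb:
        return False
    return True


# A long job that fits the spare nodes but needs burst buffer space
# that the head job is counting on.
candidate = Job(nodes=2, burst_buffer=80, walltime=200)
state = dict(now=0, free_nodes=4, free_bb=100, head_start=50,
             spare_nodes=2, spare_bb=0)

with_reservation = can_backfill(candidate, reserve_bb=True, **state)
without_reservation = can_backfill(candidate, reserve_bb=False, **state)
print(with_reservation, without_reservation)
```

In this toy scenario the candidate is rejected when burst buffers are part of the reservation and admitted when they are not. In the second case it would still hold 80 GB of burst buffer at the head job's planned start, which is the kind of delay to queued jobs that the abstract attributes to reservation-free backfilling.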
