High-performance computing (HPC) is undergoing significant changes. Next-generation HPC systems are equipped with diverse global and local resources, such as I/O burst buffers, memory resources (e.g., on-chip and off-chip RAM, external RAM/NVRAM), network resources, and possibly others. Job schedulers play a crucial role in using these resources efficiently. However, traditional job schedulers are single-objective and fail to use the other resources efficiently. In this paper, we propose ROME, a novel multi-dimensional job scheduling framework that explores potential tradeoffs among multiple resources and provides balanced scheduling decisions. Our design leverages a genetic algorithm as the multi-dimensional optimization engine to generate scheduling decisions quickly and to support effective resource utilization.
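To make the idea of a genetic algorithm as a multi-resource optimization engine concrete, the following is a minimal sketch and not ROME's actual implementation. It assumes a hypothetical system with two resources (compute nodes and burst buffer capacity), a hypothetical queue of six jobs, and a fitness function defined as the mean utilization across resources, with infeasible selections scored as zero. The names `CAPACITY`, `JOBS`, `fitness`, and `select_jobs`, and all parameter values, are illustrative assumptions.

```python
import random

# Hypothetical capacities: (compute nodes, burst buffer in TB).
CAPACITY = (100, 50)

# Hypothetical queued jobs: (nodes requested, burst buffer requested in TB).
JOBS = [(40, 5), (30, 20), (50, 10), (20, 25), (10, 0), (60, 30)]


def usage(chromosome, jobs, num_resources):
    """Total demand per resource for the jobs selected by the bit vector."""
    totals = [0] * num_resources
    for bit, job in zip(chromosome, jobs):
        if bit:
            for r, demand in enumerate(job):
                totals[r] += demand
    return totals


def fitness(chromosome, jobs, capacity):
    """Mean utilization across resources; infeasible selections score 0."""
    totals = usage(chromosome, jobs, len(capacity))
    if any(t > c for t, c in zip(totals, capacity)):
        return 0.0
    return sum(t / c for t, c in zip(totals, capacity)) / len(capacity)


def select_jobs(jobs, capacity, pop_size=30, generations=40,
                mutation_rate=0.1, seed=0):
    """Pick a subset of jobs to start now (assumes at least two jobs)."""
    rng = random.Random(seed)
    n = len(jobs)

    def score(chromosome):
        return fitness(chromosome, jobs, capacity)

    # Each chromosome is a bit vector: 1 means "start this job now".
    population = [[rng.randint(0, 1) for _ in range(n)]
                  for _ in range(pop_size)]
    best = max(population, key=score)

    for _ in range(generations):
        next_gen = [best[:]]  # elitism: carry the best individual forward
        while len(next_gen) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(population, 3), key=score)
            p2 = max(rng.sample(population, 3), key=score)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randint(1, n - 1)
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            next_gen.append(child)
        population = next_gen
        best = max(population, key=score)

    return best, score(best)


if __name__ == "__main__":
    chosen, value = select_jobs(JOBS, CAPACITY)
    print("selected jobs:", [i for i, bit in enumerate(chosen) if bit])
    print("mean utilization:", round(value, 3))
```

Averaging the per-resource utilizations is only one way to combine objectives. A multi-dimensional scheduler could instead keep the objectives separate and search for a set of Pareto-optimal selections, which is closer to the tradeoff exploration the abstract describes.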