The ever-growing processing power of supercomputers in recent decades has enabled us to explore increasingly complex scientific problems. Effectively scheduling the jobs that run on these systems is crucial for both individual job performance and overall system efficiency. Traditional job schedulers in high-performance computing (HPC) are simple and concentrate on improving CPU utilization. The emergence of new hardware resources and novel hardware structures imposes severe challenges on traditional schedulers. Increasingly diverse workloads, including compute-intensive and data-intensive applications, require more efficient schedulers. Even worse, these two factors interact with each other, which makes the scheduling problem even more challenging. In recent years, many studies have proposed new scheduling methods to combat the problems brought by rapid system changes. In this study, we investigate the challenges faced by HPC scheduling and the state-of-the-art scheduling methods that address them. Furthermore, we propose an intelligent scheduling framework to alleviate the problems encountered in modern job scheduling.