Burst-Buffering is a promising storage solution that introduces an intermediate high-throughput storage buffer layer to mitigate the I/O bottleneck from which current High-Performance Computing (HPC) platforms suffer. The existing Markov-Chain based probabilistic I/O scheduling uses the load state of the Burst-Buffers and the periodic characteristics of applications to reduce the I/O congestion caused by the limited capacity of Burst-Buffers. However, this probabilistic approach requires applications to have consistent I/O characteristics, including similar I/O durations and long application lengths, in order to obtain an accurate I/O load estimate. These consistency conditions often do not hold in realistic situations. In this paper, we propose a generic framework of dynamic probabilistic I/O scheduling based on application clustering (DPSAC) so that applications meet the consistency requirements. Our scheme first applies a one-dimensional K-means clustering algorithm to group the applications into clusters according to the I/O phase length of each application. Next, it calculates the expected workload of each cluster from the probabilistic model of the applications and partitions the Burst-Buffers among the clusters in proportion to that workload. Then, to handle dynamic changes (applications joining and exiting), it updates the clusters using a heuristic strategy. Finally, it applies probabilistic I/O scheduling, based on the distribution of application workload and the state of the Burst-Buffers, to schedule I/O for all concurrent applications and mitigate I/O congestion. Simulation results on synthetic data show that DPSAC is effective and efficient.
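The first two steps (clustering applications by I/O phase length, then splitting Burst-Buffer capacity in proportion to each cluster's expected workload) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names `kmeans_1d` and `partition_buffer`, the deterministic centroid initialisation, and the sample numbers are all assumptions made for illustration, and the per-application expected workload is taken as a given input rather than derived from the paper's probabilistic model.

```python
from typing import List, Sequence


def kmeans_1d(values: Sequence[float], k: int, iters: int = 100) -> List[int]:
    """Plain Lloyd's algorithm on scalars; returns one cluster label per value.

    Hypothetical helper: the initialisation below is an assumption,
    not something specified in the abstract.
    """
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    ordered = sorted(values)
    # Deterministic initialisation: spread centroids across the sorted values.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels: List[int] = []
    for _ in range(iters):
        # Assignment step: each value goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = sum(members) / len(members)
    return labels


def partition_buffer(
    workloads: Sequence[float], labels: Sequence[int], k: int, capacity: float
) -> List[float]:
    """Split `capacity` among k clusters in proportion to summed workload.

    `workloads` holds an assumed expected workload per application.
    """
    totals = [0.0] * k
    for load, lab in zip(workloads, labels):
        totals[lab] += load
    grand_total = sum(totals)
    if grand_total <= 0:
        raise ValueError("total expected workload must be positive")
    return [capacity * t / grand_total for t in totals]


if __name__ == "__main__":
    # Illustrative numbers only: I/O phase lengths and expected workloads.
    phase_lengths = [5.0, 6.0, 7.0, 50.0, 55.0, 60.0]
    expected_loads = [1.0, 1.0, 2.0, 4.0, 4.0, 4.0]
    labels = kmeans_1d(phase_lengths, k=2)
    shares = partition_buffer(expected_loads, labels, k=2, capacity=100.0)
    print(labels)  # [0, 0, 0, 1, 1, 1]
    print(shares)  # [25.0, 75.0]
```

In this toy input the short-phase and long-phase applications fall into separate clusters, and the cluster carrying three quarters of the expected workload receives three quarters of the buffer capacity. The heuristic cluster update and the probabilistic scheduling step are not sketched, since the abstract does not describe them in enough detail.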