Magnetic tapes are often considered as an outdated storage technology, yet they are still used to store huge amounts of data. Their main interests are a large capacity and a low price per gigabyte, which come at the cost of a much larger file access time than on disks. With tapes, finding the right ordering of multiple file accesses is thus key to performance. Moving the reading head back and forth along a kilometer long tape has a non-negligible cost and unnecessary movements thus have to be avoided. However, the optimization of tape request ordering has then rarely been studied in the scheduling literature, much less than I/O scheduling on disks. For instance, minimizing the average service time for several read requests on a linear tape remains an open question. Therefore, in this paper, we aim at improving the quality of service experienced by users of tape storage systems, and not only the peak performance of such systems. To this end, we propose a reasonable polynomial-time exact algorithm while this problem and simpler variants have been conjectured NP-hard. We also refine the proposed model by considering U-turn penalty costs accounting for inherent mechanical accelerations. Then, we propose low-cost variants of our optimal algorithm by restricting the solution space, yet still yielding an accurate suboptimal solution. Finally, we compare our algorithms to existing solutions from the literature on logs of the mass storage management system of a major datacenter. This allows us to assess the quality of previous solutions and the improvement achieved by our low-cost algorithms. Aiming for reproducibility, we make available the complete implementation of the algorithms used in our evaluation, alongside the dataset of tape requests that is, to the best of our knowledge, the first of its kind to be publicly released.
翻译:磁带通常被视为一种过时的存储技术,但它们仍然用于存储大量数据。 它们的主要利益是巨大的容量和低价每千兆字节,其成本比磁盘上的文件存取时间要大得多。 因此,磁带中找到多个文件存取的正确排序是性能的关键。 将读头沿着一个千米长的磁带来回转转是不可忽略的成本和不必要的动作,因此必须避免。 但是,在时间安排文献中很少研究磁带订购的优化,远低于磁盘上的I/O时间安排。 例如,将线性磁带上若干读取请求的平均服务时间减少到最低,这仍然是个问题。 因此,我们在本文件中,用磁带存储系统的用户所经历的服务质量的正确排序,而不仅仅是这种系统的高峰性能。 我们提出一个合理的多米时精确算算法,而这个问题和最简单的变法则已经变得简单化。 我们还改进了拟议模型, 考虑用U 将惩罚成本计算系统计算出内在的磁盘磁盘磁带磁带存取的平均时间, 最终,我们用现有数据算算算算算算算法将我们现有最精确的快速的快速的存储解决方案。