Data prefetching is a technique that hides memory latency by fetching data before a program needs it. Prefetching relies on accurate memory access prediction, a task to which machine-learning-based methods are increasingly applied. Unlike previous approaches that learn from deltas or offsets and predict a single access, we develop TransforMAP, based on the powerful Transformer model, which can learn from the whole address space and predict multiple cache lines. We propose to use the binary representation of memory addresses as model input, which avoids information loss and removes the need for a token table in hardware. We design a block index bitmap that collects the unordered future page offsets under the current page address as learning labels. As a result, our model can learn temporal as well as spatial patterns within a page. In a practical implementation, this approach has the potential to hide prediction latency because it prefetches multiple cache lines that are likely to be used over a long horizon. We show that our approach achieves a 35.67% MPKI improvement and a 20.55% IPC improvement in simulation, outperforming the state-of-the-art Best-Offset prefetcher and the ISB prefetcher.
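To make the two representations named above concrete, the following is a minimal sketch (not the authors' code) of how an address can be encoded as a binary token sequence for the model input, and how a block index bitmap over future same-page accesses can serve as the multi-label learning target. The page size (4 KB), cache line size (64 B), address width, and lookahead window are assumptions for illustration, not values taken from the paper.

```python
# Sketch of the input/label construction, assuming 4 KB pages, 64 B cache
# lines, and a 48-bit address width; none of these constants are specified
# in the abstract above.

PAGE_BITS = 12                      # assumed 4 KB pages
BLOCK_BITS = 6                      # assumed 64 B cache lines -> 64 blocks/page
BLOCKS_PER_PAGE = 1 << BLOCK_BITS
ADDR_BITS = 48                      # assumed address width fed to the model


def address_to_binary_tokens(addr: int) -> list[int]:
    """Encode an address as its raw bit sequence (MSB first).

    Feeding bits directly avoids building a delta/offset vocabulary,
    so no token table is needed and no address information is lost.
    """
    return [(addr >> i) & 1 for i in reversed(range(ADDR_BITS))]


def block_index_bitmap(current_addr: int, future_addrs: list[int]) -> list[int]:
    """Label: unordered bitmap of block (cache-line) offsets of future
    accesses that fall in the same page as the current access."""
    page = current_addr >> PAGE_BITS
    bitmap = [0] * BLOCKS_PER_PAGE
    for a in future_addrs:
        if a >> PAGE_BITS == page:
            block = (a >> BLOCK_BITS) & (BLOCKS_PER_PAGE - 1)
            bitmap[block] = 1
    return bitmap


if __name__ == "__main__":
    # One access plus a short lookahead window of future accesses.
    cur = 0x7F3A2040
    future = [0x7F3A20C0, 0x7F3A2F80, 0x12345678]   # last one is off-page
    x = address_to_binary_tokens(cur)               # model input (bit tokens)
    y = block_index_bitmap(cur, future)             # multi-label target
    print(sum(y), "blocks set in the label bitmap") # -> 2
```

Because the label is an unordered set of block indices rather than a single next delta, one prediction can cover several cache lines within the page, which is what allows the prefetcher to issue multiple requests per inference and tolerate the model's prediction latency.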