Machine learning algorithms have shown potential to improve prefetching performance by accurately predicting future memory accesses. Existing approaches borrow their modeling from text prediction, treating prefetching as a classification problem over sequences. However, the vast and sparse memory address space leads to a large vocabulary, which makes this modeling impractical. The number and ordering of outputs required for multiple cache line prefetching also differ fundamentally from text prediction. We propose TransFetch, a novel way to model prefetching. To reduce the vocabulary size, we use fine-grained address segmentation as input. To predict unordered sets of future addresses, we use delta bitmaps for multiple outputs. We apply an attention-based network to learn the mapping between input and output. Prediction experiments demonstrate that address segmentation achieves a 26%-36% higher F1-score than delta inputs and a 15%-24% higher F1-score than page & offset inputs on the SPEC 2006, SPEC 2017, and GAP benchmarks. Simulation results show that TransFetch achieves a 38.75% IPC improvement over no prefetching, outperforming the best-performing rule-based prefetcher BOP by 10.44% and the ML-based prefetcher Voyager by 6.64%.
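The two encodings named in the abstract can be sketched as follows. This is a minimal illustration only, not the paper's implementation: the segment width, cache-line size, and delta range are assumptions chosen for clarity.

```python
# Illustrative sketch of the two encodings described in the abstract.
# SEG_BITS and MAX_DELTA are hypothetical values, not the paper's settings.

BLOCK_BITS = 6          # assume 64-byte cache lines
SEG_BITS = 4            # assumed fine-grained segment width
NUM_SEGS = 64 // SEG_BITS

def segment_address(addr: int) -> list[int]:
    """Split a 64-bit address into fixed-width segments used as input tokens.
    Each segment takes one of only 2**SEG_BITS values, so the model's
    vocabulary stays small despite the vast, sparse address space."""
    mask = (1 << SEG_BITS) - 1
    return [(addr >> (i * SEG_BITS)) & mask for i in range(NUM_SEGS)]

def delta_bitmap(current: int, futures: list[int], max_delta: int = 64) -> list[int]:
    """Encode an unordered set of future addresses as a bitmap over
    cache-line deltas from the current access, giving an order-free
    multi-label output for multiple cache line prefetching."""
    bitmap = [0] * (2 * max_delta + 1)
    cur_line = current >> BLOCK_BITS
    for f in futures:
        d = (f >> BLOCK_BITS) - cur_line
        if -max_delta <= d <= max_delta:
            bitmap[d + max_delta] = 1   # offset so negative deltas fit
    return bitmap
```

Segmentation is lossless (the address is recoverable by reassembling the segments), while the bitmap deliberately discards the order of future accesses, which the abstract argues text-style sequence prediction cannot do.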