Pythia: A Customizable Hardware Prefetching Framework Using Online Reinforcement Learning

Past research has proposed numerous hardware prefetching techniques, most of which rely on exploiting one specific type of program context information (e.g., program counter, cacheline address) to predict future memory accesses. These techniques either completely neglect a prefetcher's undesirable effects (e.g., memory bandwidth usage) on the overall system, or incorporate system-level feedback as an afterthought to a system-unaware prefetch algorithm. We show that prior prefetchers often lose their performance benefit over a wide range of workloads and system configurations due to their inherent inability to take multiple different types of program context and system-level feedback information into account while prefetching. In this paper, we make a case for designing a holistic prefetch algorithm that learns to prefetch using multiple different types of program context and system-level feedback information inherent to its design. To this end, we propose Pythia, which formulates the prefetcher as a reinforcement learning agent. For every demand request, Pythia observes multiple different types of program context information to make a prefetch decision. For every prefetch decision, Pythia receives a numerical reward that evaluates prefetch quality under the current memory bandwidth usage. Pythia uses this reward to reinforce the correlation between program context information and prefetch decision to generate highly accurate, timely, and system-aware prefetch requests in the future. Our extensive evaluations using simulation and hardware synthesis show that Pythia outperforms multiple state-of-the-art prefetchers over a wide range of workloads and system configurations, while incurring only 1.03% area overhead over a desktop-class processor and no software changes in workloads. The source code of Pythia can be freely downloaded from https://github.com/CMU-SAFARI/Pythia.
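The observe, decide, reward loop described in the abstract can be illustrated with a small toy model. The sketch below is not Pythia's actual implementation. The candidate offsets, the reward values, the state definition (load PC plus last address delta), and the simple incremental value update are all assumptions chosen for illustration. It only shows how a reward that depends on bandwidth usage can steer an agent toward a useful prefetch offset on a constant-stride stream.

```python
import random
from collections import defaultdict

# Hypothetical candidate prefetch offsets in cachelines; 0 means "no prefetch".
ACTIONS = [0, 1, 2, 4]


def reward(issued: bool, useful: bool, bandwidth_high: bool) -> float:
    """Illustrative reward values (assumed, not taken from the paper)."""
    if not issued:
        return 0.0
    if useful:
        return 1.0
    # A useless prefetch costs more when memory bandwidth is scarce.
    return -2.0 if bandwidth_high else -0.5


class PrefetchAgent:
    """Toy tabular agent with epsilon-greedy action selection."""

    def __init__(self, alpha: float = 0.1, epsilon: float = 0.1, seed: int = 0):
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.alpha = alpha
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def best_action(self, state):
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def choose(self, state):
        # Explore occasionally, otherwise exploit the current estimates.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return self.best_action(state)

    def update(self, state, action, r: float) -> None:
        # Simplified update: move the estimate toward the observed reward.
        key = (state, action)
        self.q[key] += self.alpha * (r - self.q[key])


def simulate(steps: int = 2000, stride: int = 2,
             bandwidth_high: bool = False, seed: int = 0) -> PrefetchAgent:
    """Train the agent on a synthetic constant-stride access stream."""
    agent = PrefetchAgent(seed=seed)
    pc = 0x400123  # hypothetical load PC
    addr, last_delta = 0, None
    for _ in range(steps):
        state = (pc, last_delta)       # program context observed per demand
        offset = agent.choose(state)   # prefetch decision
        next_addr = addr + stride
        issued = offset != 0
        useful = issued and (addr + offset == next_addr)
        agent.update(state, offset, reward(issued, useful, bandwidth_high))
        last_delta, addr = stride, next_addr
    return agent


if __name__ == "__main__":
    trained = simulate()
    print(trained.best_action((0x400123, 2)))
```

In this toy setup, only the offset that matches the stride earns a positive reward, so the agent's greedy choice for the steady-state context settles on that offset. Setting `bandwidth_high=True` makes inaccurate prefetches more costly, which is the kind of system-level feedback the abstract argues should be built into the prefetch algorithm.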
