A Reinforcement Learning Environment for Polyhedral Optimizations

The polyhedral model allows a structured way of defining semantics-preserving transformations to improve the performance of a large class of loops. Finding profitable points in this space is a hard problem which is usually approached by heuristics that generalize from domain-expert knowledge. Existing problem formulations in state-of-the-art heuristics depend on the shape of particular loops, making it hard to leverage generic and more powerful optimization techniques from the machine learning domain. In this paper, we propose PolyGym, a shape-agnostic formulation for the space of legal transformations in the polyhedral model as a Markov Decision Process (MDP). Instead of using transformations, the formulation is based on an abstract space of possible schedules. In this formulation, states model partial schedules, which are constructed by actions that are reusable across different loops. With a simple heuristic to traverse the space, we demonstrate that our formulation is powerful enough to match and outperform state-of-the-art heuristics. On the Polybench benchmark suite, we found transformations that led to a speedup of 3.39x over LLVM O3, which is 1.83x better than the speedup achieved by ISL. Our generic MDP formulation enables using reinforcement learning to learn optimization policies over a wide range of loops. This also contributes to the emerging field of machine learning in compilers, as it exposes a novel problem formulation that can push the limits of existing methods.
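To make the idea of states as partial schedules and loop-independent actions concrete, the following is a minimal toy sketch. It is not PolyGym's actual state or action space. The names `Action`, `PartialSchedule`, `step`, and `run` are hypothetical. The three actions are an assumption chosen for illustration. The sketch performs no legality check against data dependences, which a real formulation over the space of legal schedules would need.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Iterable, List, Tuple


class Action(Enum):
    """Hypothetical actions; none of them refers to a particular loop's shape."""
    INCREMENT = 0   # add 1 to the coefficient under the cursor
    NEXT_COEFF = 1  # move the cursor to the next coefficient (wraps around)
    NEXT_DIM = 2    # close the current schedule dimension and open a new one


@dataclass
class PartialSchedule:
    """Toy state: finished schedule rows plus the row under construction."""
    depth: int                                          # loop-nest depth
    rows: List[Tuple[int, ...]] = field(default_factory=list)
    current: List[int] = field(default_factory=list)
    cursor: int = 0

    def __post_init__(self) -> None:
        if not self.current:
            self.current = [0] * self.depth

    @property
    def done(self) -> bool:
        # The toy schedule is complete once it has one row per loop level.
        return len(self.rows) == self.depth


def step(state: PartialSchedule, action: Action) -> PartialSchedule:
    """Toy transition function: apply one action in place and return the state."""
    if state.done:
        raise ValueError("schedule is already complete")
    if action is Action.INCREMENT:
        state.current[state.cursor] += 1
    elif action is Action.NEXT_COEFF:
        state.cursor = (state.cursor + 1) % state.depth
    elif action is Action.NEXT_DIM:
        state.rows.append(tuple(state.current))
        state.current = [0] * state.depth
        state.cursor = 0
    return state


def run(depth: int, actions: Iterable[Action]) -> PartialSchedule:
    """Replay a sequence of actions starting from the empty schedule."""
    state = PartialSchedule(depth)
    for action in actions:
        state = step(state, action)
    return state


# The same action sequence can be replayed on loop nests of different depth.
INTERCHANGE = [
    Action.NEXT_COEFF, Action.INCREMENT, Action.NEXT_DIM,  # first row: inner loop
    Action.INCREMENT, Action.NEXT_DIM,                     # second row: outer loop
]

if __name__ == "__main__":
    print(run(2, INTERCHANGE).rows)  # complete schedule for a 2-deep nest
    print(run(3, INTERCHANGE).rows)  # still partial for a 3-deep nest
```

For a two-deep loop nest, the `INTERCHANGE` sequence yields the rows `(0, 1)` and `(1, 0)`, which corresponds to swapping the two loops. On a three-deep nest the same sequence is still valid but leaves the schedule partial, which illustrates how actions defined over schedule coefficients, and not over named transformations, can be shared across loops.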
