A typical compiler flow relies on a unidirectional sequence of translation and optimization steps that progressively lower the program's abstract representation, which makes it hard to preserve higher-level program information across transformation steps. On the other hand, modern ISA extensions and hardware accelerators can benefit from the compiler's ability to detect program idioms and raise them to acceleration instructions or optimized library calls. Although recent works based on Multi-Level IR (MLIR) have been proposed for code raising, they rely on specialized languages, compiler recompilation, or in-depth dialect knowledge. This paper presents Source Matching and Rewriting (SMR), a user-oriented, source-code-based approach for MLIR idiom matching and rewriting that does not require a compiler expert's intervention. SMR uses a two-phase automaton-based DAG-matching algorithm inspired by early work on tree-pattern matching. First, the idiom's Control-Dependency Graph (CDG) is matched against the program's CDG to rule out code fragments whose control-flow structure is not similar to that of the desired idiom. Second, the candidate code fragments that survive the first phase have their Data-Dependency Graphs (DDGs) constructed and matched against the idiom's DDG. Experimental results show that SMR can effectively match idioms in Fortran (FIR) and C (CIL) programs and raise them to BLAS calls to improve performance.
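The two-phase idea can be illustrated with a small sketch. This is not SMR's implementation: the automaton-based matcher is replaced here by a plain recursive structural comparison, array indices are ignored, and every name (`Region`, `cdg_candidates`, `ddg_match`, `find_idiom`, the `$`-prefixed wildcards) is a hypothetical choice made for illustration. Phase 1 keeps only code regions whose control shape equals the idiom's; phase 2 matches the data-dependency expressions of the surviving candidates.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """Hypothetical control-dependency node: a kind plus nested regions.

    `body` holds a simplified data-dependency expression as nested tuples,
    e.g. ("store", "c", ("add", ("load", "c"), ...)); None if there is none.
    """
    kind: str
    children: list = field(default_factory=list)
    body: object = None


def cdg_shape(region):
    """Canonical control shape: the kind and the shapes of nested regions."""
    return (region.kind, tuple(cdg_shape(c) for c in region.children))


def cdg_candidates(program, idiom):
    """Phase 1: yield program regions whose control shape equals the idiom's."""
    target = cdg_shape(idiom)
    stack = [program]
    while stack:
        region = stack.pop()
        if cdg_shape(region) == target:
            yield region
        stack.extend(region.children)


def ddg_match(pattern, expr, binding):
    """Match one expression; '$name' wildcards must bind consistently."""
    if isinstance(pattern, str) and pattern.startswith("$"):
        if pattern in binding:
            return binding[pattern] == expr
        binding[pattern] = expr
        return True
    if isinstance(pattern, tuple) and isinstance(expr, tuple):
        return len(pattern) == len(expr) and all(
            ddg_match(p, e, binding) for p, e in zip(pattern, expr)
        )
    return pattern == expr


def regions_match(idiom, cand, binding):
    """Phase 2: walk two same-shaped regions in parallel, matching bodies."""
    if (idiom.body is None) != (cand.body is None):
        return False
    if idiom.body is not None and not ddg_match(idiom.body, cand.body, binding):
        return False
    return all(
        regions_match(i, c, binding)
        for i, c in zip(idiom.children, cand.children)
    )


def find_idiom(program, idiom):
    """Run both phases; return one wildcard binding per matched fragment."""
    matches = []
    for cand in cdg_candidates(program, idiom):
        binding = {}  # fresh binding per candidate
        if regions_match(idiom, cand, binding):
            matches.append(binding)
    return matches


def loop_nest(depth, body):
    """Build a perfectly nested loop of the given depth around `body`."""
    region = Region("loop", body=body)
    for _ in range(depth - 1):
        region = Region("loop", [region])
    return region


# Idiom: a GEMM-like triple loop computing C = C + A * B (indices omitted).
gemm_body = ("store", "$C",
             ("add", ("load", "$C"), ("mul", ("load", "$A"), ("load", "$B"))))
idiom = loop_nest(3, gemm_body)

program = Region("func", [
    # Same control shape and same data flow: should match.
    loop_nest(3, ("store", "c",
                  ("add", ("load", "c"), ("mul", ("load", "a"), ("load", "b"))))),
    # Only two loops deep: rejected by phase 1.
    loop_nest(2, ("store", "d", ("load", "a"))),
    # Three loops deep but different data flow: rejected by phase 2.
    loop_nest(3, ("store", "d", ("add", ("load", "d"), ("load", "a")))),
])

matches = find_idiom(program, idiom)
print(matches)
```

In this sketch, the returned bindings identify which program values play the roles of the idiom's operands, which is the information a rewriting step would need in order to emit a replacement such as a BLAS call.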