Multi-Level Intermediate Representation (MLIR) shows great promise for reducing the cost of building domain-specific compilers by providing a reusable and extensible compiler infrastructure. This work presents TPU-MLIR, an end-to-end compiler based on MLIR that deploys pre-trained neural network (NN) models to a custom ASIC called a Tensor Processing Unit (TPU). TPU-MLIR defines two new dialects to implement its functionality: 1. a Tensor operation (TOP) dialect that encodes the deep learning graph semantics and is independent of the deep learning framework, and 2. a TPU kernel dialect that provides standard kernel computations on the TPU. An NN model is first translated to the TOP dialect and then lowered to the TPU dialect for different TPUs according to the chip's configuration. We demonstrate how to use the MLIR pass pipeline to organize and perform optimizations for the TPU and to generate machine code. The paper also presents a verification procedure to ensure the correctness of each transformation stage.
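To make the two-dialect flow concrete, the following sketch shows how a single convolution might appear before and after lowering. It is illustrative only: the op names (top.Conv, tpu.Conv2D), attribute names (kernel_shape, strides, pads), and quantization parameters are assumptions for exposition, not verbatim TPU-MLIR definitions.

    // TOP dialect: framework-independent graph semantics, float types
    // (illustrative op and attribute names, not verbatim TPU-MLIR)
    %y = "top.Conv"(%x, %w, %b) {kernel_shape = [3, 3], strides = [1, 1], pads = [1, 1, 1, 1]}
         : (tensor<1x3x224x224xf32>, tensor<64x3x3x3xf32>, tensor<64xf32>)
        -> tensor<1x64x224x224xf32>

    // TPU dialect after lowering for a specific chip: the same operation,
    // now carrying quantized types derived from the chip's configuration
    // (scale values 0.05 and 0.1 are made-up placeholders)
    %y = "tpu.Conv2D"(%x, %w, %b) {kernel_shape = [3, 3], strides = [1, 1], pads = [1, 1, 1, 1]}
         : (tensor<1x3x224x224x!quant.uniform<i8:f32, 0.05>>, tensor<64x3x3x3xi8>, tensor<64xi32>)
        -> tensor<1x64x224x224x!quant.uniform<i8:f32, 0.1>>

Under this reading, the verification procedure mentioned above can run both forms on the same inputs and compare their numerical outputs to confirm that a transformation stage preserved the model's behavior.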