The difficulty of deploying various deep learning (DL) models on diverse DL hardware has boosted the research and development of DL compilers in the community. Several DL compilers have been proposed by both industry and academia, such as TensorFlow XLA and TVM. Generally, the DL compilers take the DL models described in different DL frameworks as input, and then generate optimized code for diverse DL hardware as output. However, none of the existing surveys has comprehensively analyzed the unique design architecture of the DL compilers. In this paper, we perform a comprehensive survey of existing DL compilers by dissecting the commonly adopted design in detail, with emphasis on the DL-oriented multi-level IRs and the frontend/backend optimizations. Specifically, we provide a comprehensive comparison among existing DL compilers from various aspects. In addition, we present a detailed analysis of the design of multi-level IRs and illustrate the commonly adopted optimization techniques. Finally, several insights are highlighted as potential research directions for DL compilers. This is the first survey paper focusing on the design architecture of DL compilers, which we hope can pave the road for future research toward DL compilers.
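The pipeline described above, a framework-level model entering a high-level graph IR, passing through frontend optimizations, and being lowered to a low-level IR for code generation, can be sketched in miniature. This is a hypothetical toy model for illustration only, not the API of any real compiler such as TVM or XLA; all class and function names are invented:

```python
# Toy sketch of a DL compiler pipeline (hypothetical names, not a real API):
# framework model -> graph-level IR -> frontend optimization -> low-level IR.
from dataclasses import dataclass, field


@dataclass
class GraphIR:
    """High-level, DL-oriented IR: an ordered list of operator nodes."""
    ops: list


@dataclass
class LowIR:
    """Low-level, hardware-oriented IR: primitive loop statements."""
    stmts: list = field(default_factory=list)


def frontend_optimize(g: GraphIR) -> GraphIR:
    """A typical frontend pass: fuse adjacent conv2d+relu into one operator."""
    fused, i = [], 0
    while i < len(g.ops):
        if g.ops[i] == "conv2d" and i + 1 < len(g.ops) and g.ops[i + 1] == "relu":
            fused.append("conv2d_relu")  # operator fusion avoids one memory round trip
            i += 2
        else:
            fused.append(g.ops[i])
            i += 1
    return GraphIR(fused)


def lower(g: GraphIR) -> LowIR:
    """Lower each graph-level operator to a loop-level statement."""
    low = LowIR()
    for op in g.ops:
        low.stmts.append(f"for i: {op}_kernel(i)")
    return low


# A model as imported from some DL framework, then compiled end to end.
model = GraphIR(["conv2d", "relu", "dense"])
low = lower(frontend_optimize(model))
print(low.stmts)
```

Real DL compilers differ in the details, but the two-level structure, graph-level transformations in the frontend and hardware-oriented lowering in the backend, is the common design this survey dissects.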