With the rapid progress of deep learning and large language models (LLMs), companies now spend enormous sums executing GPU kernels. These kernels have, therefore, become prime targets for aggressive optimization. Recent efforts increasingly leverage LLMs to generate GPU kernels, but make no formal guarantees about the generated kernels. We present the first equivalence checker for GPU kernels and use it to formally verify the correctness of machine learning (ML) kernels optimized by hand, by LLMs, and by compilers. We show that our equivalence checker is sound and, for a well-defined class of GPU kernels which includes the programs of interest, complete. Our implementation, VOLTA, can verify ML computations such as convolutions, matrix multiplications, and various attention mechanisms.
翻译:随着深度学习和大型语言模型(LLMs)的快速发展,企业在执行GPU内核上投入了巨额资金。因此,这些内核已成为激进优化的主要目标。近期研究越来越多地利用LLMs生成GPU内核,但未对生成内核提供形式化保证。我们提出了首个GPU内核等价性验证器,并利用它形式化验证了通过手工、LLMs以及编译器优化的机器学习(ML)内核的正确性。我们证明该等价性验证器是可靠的,并且对于一类明确定义的GPU内核(包含目标程序)是完备的。我们的实现VOLTA能够验证卷积、矩阵乘法及多种注意力机制等机器学习计算。