Deep Learning (DL) compilers have been widely used to optimize DL models for efficient deployment across diverse hardware. Given their vital role in the DL ecosystem, ensuring their reliability and security is critical. However, existing approaches fall short in testing the optimization stages, the core functionality of DL compilers, because of the difficulty of generating optimization-aware tests. In this paper, we propose OATest, a novel approach for synthesizing optimization-aware computational graphs. OATest extracts patterns from documented optimization tests and incorporates them into seed computational graphs, enabling broader exploration of optimization paths. To guarantee the optimization-awareness of the generated graphs, OATest introduces an edge-reusing strategy that establishes strong connections between patterns and their contexts. Additionally, to address the validity challenge for the generated graphs, OATest employs an auxiliary-layer addition strategy that repairs broken constraints. Equipped with two distinct test oracles, OATest applies differential testing to two widely used DL compilers, TVM and ONNXRuntime. Our experimental results show that OATest outperforms the state-of-the-art method, detecting more bugs and achieving higher code coverage on TVM and ONNXRuntime. Moreover, OATest uncovers 58 previously unknown bugs, 36 of which have been confirmed or fixed by developers.
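The differential-testing idea at the heart of the abstract can be sketched in miniature. The toy graph, the two "backends" (a naive interpreter and one that fuses adjacent add/mul steps, mimicking a compiler optimization pass), and the numerical oracle below are simplified stand-ins invented for illustration, not OATest's actual implementation or the real TVM/ONNXRuntime APIs:

```python
import math

# A toy computational graph: a list of (op, operand) steps applied to a scalar x.
GRAPH = [("add", 1.0), ("mul", 2.0), ("relu", None)]

def run_reference(graph, x):
    """Naive interpreter: plays the role of the reference backend."""
    for op, c in graph:
        if op == "add":
            x = x + c
        elif op == "mul":
            x = x * c
        elif op == "relu":
            x = max(x, 0.0)
    return x

def run_optimized(graph, x):
    """'Optimized' backend: fuses an add followed by a mul into one step,
    mimicking the graph rewrites a DL compiler's optimization pass performs."""
    i = 0
    while i < len(graph):
        op, c = graph[i]
        if op == "add" and i + 1 < len(graph) and graph[i + 1][0] == "mul":
            m = graph[i + 1][1]
            x = x * m + c * m      # fused form of (x + c) * m
            i += 2
        elif op == "add":
            x = x + c
            i += 1
        elif op == "mul":
            x = x * c
            i += 1
        elif op == "relu":
            x = max(x, 0.0)
            i += 1
    return x

def differential_test(graph, inputs, tol=1e-6):
    """Numerical test oracle: report any input on which the backends disagree.
    A non-empty result would indicate a (toy) optimization bug."""
    mismatches = []
    for x in inputs:
        a, b = run_reference(graph, x), run_optimized(graph, x)
        if not math.isclose(a, b, rel_tol=tol, abs_tol=tol):
            mismatches.append((x, a, b))
    return mismatches

print(differential_test(GRAPH, [-3.0, -1.0, 0.0, 0.5, 2.0]))  # [] -> backends agree
```

In OATest the two executors are real compiler stacks (TVM and ONNXRuntime) running the same synthesized optimization-aware graph, and a divergence beyond tolerance is one of the signals the test oracles flag.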