Deep Learning (DL) is widely used across industries to improve decision-making and automate processes, driven by ever-evolving DL libraries and compilers. The correctness of DL systems is crucial for trust in DL applications. Accordingly, a recent wave of research has studied the automated synthesis of test cases (i.e., DNN models and their inputs) for fuzzing DL systems. However, existing model generators cover only a limited number of operators, because they lack the ability to model operator constraints pervasively. To address this challenge, we propose NeuRI, a fully automated approach for generating valid and diverse DL models composed of hundreds of types of operators. NeuRI adopts a three-step process: (i) collecting valid and invalid API traces from various sources; (ii) applying inductive program synthesis over the traces to infer the constraints for constructing valid models; and (iii) performing hybrid model generation that combines symbolic and concrete operators concolically. Our evaluation shows that NeuRI improves branch coverage of TensorFlow and PyTorch by 51% and 15%, respectively, over the state-of-the-art. Within four months, NeuRI found 87 new bugs in PyTorch and TensorFlow, of which 64 have already been fixed or confirmed and 8 were labeled high-priority by PyTorch, constituting 10% of all high-priority bugs of the period. Additionally, open-source developers regard the error-inducing models we reported as "high-quality" and "common in practice".
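To make step (ii) more concrete, the following is a toy sketch of inferring an operator constraint from traces. It is not NeuRI's implementation: the trace data, the candidate predicate family (equalities of the form `a[i] == b[j]` between input dimensions), and the function names `candidates`, `holds`, and `infer` are all assumptions made for illustration. The idea shown is only the general one: keep the candidate predicates that hold on every valid trace, then check that they jointly reject every invalid trace.

```python
from itertools import product

# Hypothetical traces for a 2-D matmul-like operator:
# (shape of input a, shape of input b, whether the call succeeded).
TRACES = [
    ((2, 3), (3, 4), True),
    ((5, 7), (7, 1), True),
    ((4, 4), (4, 6), True),
    ((2, 3), (4, 3), False),
    ((5, 7), (5, 7), False),
]


def candidates(rank_a, rank_b):
    """Enumerate candidate predicates (i, j), each meaning a[i] == b[j]."""
    return list(product(range(rank_a), range(rank_b)))


def holds(pred, a, b):
    """Evaluate one candidate predicate on a concrete pair of shapes."""
    i, j = pred
    return a[i] == b[j]


def infer(traces, rank_a=2, rank_b=2):
    """Return (predicates kept, whether they reject every invalid trace)."""
    valid = [(a, b) for a, b, ok in traces if ok]
    invalid = [(a, b) for a, b, ok in traces if not ok]
    # Keep only predicates that are satisfied by every valid trace.
    kept = [p for p in candidates(rank_a, rank_b)
            if all(holds(p, a, b) for a, b in valid)]
    # The conjunction of kept predicates should fail on each invalid trace.
    rejects_all = all(any(not holds(p, a, b) for p in kept)
                      for a, b in invalid)
    return kept, rejects_all


if __name__ == "__main__":
    print(infer(TRACES))
```

On these made-up traces, the only surviving predicate is `(1, 0)`, i.e. `a[1] == b[0]`, which matches the usual inner-dimension rule for matrix multiplication. A real system would need a much richer predicate space (arithmetic over dimensions, ranks, dtypes, attributes) than this single equality family.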