As neural networks are increasingly included as core components of safety-critical systems, developing effective testing techniques specialized for them becomes crucial. The bulk of the research has focused on testing neural-network models (for instance, their robustness and reliability as classifiers). But neural-network models are defined by writing programs (usually in a programming language like Python), and there is growing evidence that these neural-network programs often have bugs. Thus, being able to effectively test neural-network programs is instrumental to their dependability. This paper presents aNNoTest: an approach to generating test inputs for neural-network programs. A fundamental challenge is that the dynamically-typed languages used to program neural networks cannot express detailed constraints about valid function inputs. Without knowing these constraints, automated test-case generation is prone to producing many invalid inputs, which trigger spurious failures and are useless for identifying real bugs. To address this problem, we introduce a simple annotation language tailored for expressing valid function inputs in neural-network programs. aNNoTest takes an annotated program as input and uses property-based testing to generate random inputs that satisfy the validity constraints. We also outline guidelines that help reduce the effort needed to write aNNoTest annotations. We evaluated aNNoTest on 19 neural-network programs from Islam et al.'s survey. aNNoTest automatically generated test inputs that revealed 94 bugs, including 63 bugs that the survey reported for these projects. These results suggest that aNNoTest can be a cost-effective approach to finding widespread bugs in neural-network programs.
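To illustrate the idea, the following is a minimal sketch of how validity constraints on a neural-network function's inputs can drive property-based test generation, using the Hypothesis library. The function build_dense_layer and the specific strategies are illustrative assumptions, not taken from aNNoTest; aNNoTest's own annotation language and generator may differ.

```python
# Sketch: expressing input-validity constraints as Hypothesis strategies,
# so the test generator produces only valid inputs (hypothetical example;
# not aNNoTest's actual annotation syntax).
import numpy as np
import hypothesis.strategies as st
from hypothesis import given, settings
from hypothesis.extra.numpy import arrays

def build_dense_layer(inputs: np.ndarray, units: int) -> np.ndarray:
    """Toy neural-network program fragment under test."""
    weights = np.random.default_rng(0).normal(size=(inputs.shape[1], units))
    return inputs @ weights

# Validity constraints: a non-empty 2-D float32 batch with bounded,
# finite entries, and a strictly positive layer width.
valid_inputs = arrays(
    dtype=np.float32,
    shape=st.tuples(st.integers(1, 8), st.integers(1, 16)),
    elements=st.floats(-1.0, 1.0, width=32),
)
valid_units = st.integers(min_value=1, max_value=32)

@settings(max_examples=50)
@given(inputs=valid_inputs, units=valid_units)
def test_build_dense_layer(inputs, units):
    out = build_dense_layer(inputs, units)
    # Generic oracles: any crash, shape mismatch, or non-finite output
    # on a *valid* input indicates a real bug, not a spurious failure.
    assert out.shape == (inputs.shape[0], units)
    assert np.all(np.isfinite(out))
```

Because every generated input satisfies the declared constraints, any failure points to a genuine defect in the program rather than to an invalid input, which is the core benefit of annotation-guided generation that the abstract describes.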