Deep Learning (DL) components are routinely integrated into software systems that need to perform complex tasks such as image or natural language processing. The adequacy of the data used to test such systems can be assessed by its ability to expose artificially injected faults (mutations) that simulate real DL faults. In this paper, we describe an approach to automatically generate new test inputs that can be used to augment the existing test set so that its capability to detect DL mutations increases. Our tool DeepMetis implements a search-based input generation strategy. To account for the non-determinism of the training and the mutation processes, our fitness function involves multiple instances of the DL model under test. Experimental results show that DeepMetis is effective at augmenting the given test set, increasing its capability to detect mutants by 63% on average. A leave-one-out experiment shows that the augmented test set is capable of exposing unseen mutants, which simulate the occurrence of yet undetected faults.
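The fitness function mentioned above aggregates the behaviour of several independently trained instances of the model under test. Below is a minimal, hypothetical sketch of this idea, not the authors' implementation: it assumes Keras-style models exposing `predict`, and the helper names (`confidence_margin`, `fitness`) and the specific aggregation are illustrative assumptions only.

```python
# Hypothetical sketch of a fitness function averaged over multiple model
# instances to smooth out training non-determinism. Not the DeepMetis code:
# model loading, the mutant list, and the margin measure are assumptions.
import numpy as np


def confidence_margin(model, x, true_label):
    """Probability of the true class minus the strongest competing class;
    negative values indicate misclassification."""
    probs = model.predict(x[np.newaxis, ...], verbose=0)[0]
    competing = np.delete(probs, true_label).max()
    return probs[true_label] - competing


def fitness(x, true_label, original_models, mutant_models):
    # Average the margin over several independently trained instances of
    # the original and the mutated model.
    orig = np.mean([confidence_margin(m, x, true_label) for m in original_models])
    mut = np.mean([confidence_margin(m, x, true_label) for m in mutant_models])
    # Lower is better: the input should push the mutants toward (or past)
    # the decision boundary while the original models stay confident.
    return mut - orig
```

A search algorithm would then favour inputs with low fitness, i.e. inputs that remain correctly classified by the original model instances but expose the injected mutation.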