This paper presents an extended version of Deeper, a search-based, simulation-integrated test solution that generates failure-revealing test scenarios for a deep neural network-based lane-keeping system. The new version employs a new set of bio-inspired search algorithms: a genetic algorithm (GA), $(\mu+\lambda)$ and $(\mu,\lambda)$ evolution strategies (ES), and particle swarm optimization (PSO). These algorithms leverage a quality population seed and domain-specific crossover and mutation operations tailored to the presentation model used for modeling the test scenarios. To demonstrate the capabilities of the new test generators within Deeper, we carry out an empirical evaluation and compare the results against those of five participating tools in the cyber-physical systems testing competition at SBST 2021. Our evaluation shows that the newly proposed test generators in Deeper not only represent a considerable improvement over the previous version but are also effective and efficient in provoking a considerable number of diverse failure-revealing test scenarios for testing an ML-driven lane-keeping system. They can trigger several failures while promoting test scenario diversity under a limited test time budget, high target failure severity, and strict speed limit constraints.
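For readers unfamiliar with the notation, the difference between the $(\mu+\lambda)$ and $(\mu,\lambda)$ evolution strategies lies in survivor selection: the former picks the next $\mu$ parents from parents and offspring together, the latter from the offspring only. The following is a minimal generic sketch of that textbook distinction, not Deeper's implementation. The function name `evolve_one_generation`, the Gaussian mutation, and the toy fitness in the usage note are all illustrative assumptions; Deeper's actual road representation, operators, and fitness function are not reproduced here.

```python
import random
from typing import Callable, List

Individual = List[float]  # hypothetical encoding; Deeper's scenario model differs


def evolve_one_generation(
    parents: List[Individual],
    fitness: Callable[[Individual], float],
    lam: int,
    plus_selection: bool,
    rng: random.Random,
    sigma: float = 0.1,
) -> List[Individual]:
    """Run one ES generation and return the mu survivors (higher fitness is better)."""
    mu = len(parents)
    if not plus_selection and lam < mu:
        raise ValueError("(mu, lambda) selection needs lam >= mu")

    # Create lam offspring, each by Gaussian mutation of a randomly chosen parent.
    offspring = []
    for _ in range(lam):
        parent = rng.choice(parents)
        offspring.append([gene + rng.gauss(0.0, sigma) for gene in parent])

    # (mu + lambda): parents compete with offspring, so the best is never lost.
    # (mu, lambda): parents are discarded and only offspring can survive.
    pool = parents + offspring if plus_selection else offspring
    return sorted(pool, key=fitness, reverse=True)[:mu]
```

As a usage illustration, calling `evolve_one_generation(parents, fitness, lam=6, plus_selection=True, rng=random.Random(0))` with a toy fitness such as `lambda ind: -sum(g * g for g in ind)` returns `len(parents)` survivors whose best fitness is at least that of the best parent. With `plus_selection=False` that guarantee no longer holds, which trades elitism for more exploration.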