As Deep Learning (DL) is increasingly adopted in safety-critical applications, concerns about its quality and reliability have grown. As in the traditional software development process, testing DL software to uncover defects at an early stage is an effective way to reduce risks after deployment. Although recent progress has been made in designing novel testing techniques for DL software, the distribution of the generated test data is not taken into consideration. It is therefore hard to judge whether the identified errors are meaningful to the DL application. We propose a new out-of-distribution (OOD)-guided testing technique that aims to generate unseen test cases relevant to the underlying DL system's task. Our results show that this technique filters out up to 55.44% of error-inducing test cases on CIFAR-10 and is 10.05% more effective at enhancing model robustness.
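The abstract does not specify which OOD score the technique uses; a minimal sketch of the filtering idea, assuming a maximum-softmax-probability (MSP) score as an illustrative stand-in, might look like the following. All names, the threshold, and the logits are hypothetical.

```python
import numpy as np

def max_softmax_score(logits):
    """Maximum softmax probability (MSP), a common OOD score:
    in-distribution inputs tend to yield higher confidence."""
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return p.max(axis=1)

def filter_in_distribution(test_logits, threshold=0.5):
    """Keep indices of generated test cases whose OOD score passes the
    threshold, discarding likely-OOD (task-irrelevant) error cases."""
    scores = max_softmax_score(test_logits)
    return np.where(scores >= threshold)[0]

# Hypothetical model logits for 3 generated test inputs over 4 classes.
logits = np.array([
    [4.0, 0.1, 0.1, 0.1],   # confident -> likely in-distribution
    [0.5, 0.4, 0.6, 0.5],   # near-uniform -> likely OOD
    [3.0, 0.2, 2.9, 0.1],   # ambiguous between two classes
])
kept = filter_in_distribution(logits, threshold=0.5)
print(kept.tolist())  # → [0]
```

Only the confidently classified input survives the filter; the two low-score inputs would be treated as OOD and excluded from the error count, which is the mechanism the filtering statistic above relies on.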