Deep Neural Networks (DNNs) are widely used to perform real-world tasks in cyber-physical systems such as Autonomous Driving Systems (ADS). Ensuring the correct behavior of such DNN-Enabled Systems (DES) is therefore crucial. Online testing is a promising way to test such systems: the system runs in a closed loop with its application environment (simulated or real), so the continuous interaction between the two is taken into account. However, environmental variables (e.g., lighting conditions) that may change during operation in the real world, and thereby cause the DES to violate safety or functional requirements, are often kept constant during the execution of an online test scenario. This is due to two major challenges: (1) the space of all possible scenarios to explore becomes even larger if these variables are allowed to change, and (2) there are typically many requirements to test simultaneously. In this paper, we present MORLOT (Many-Objective Reinforcement Learning for Online Testing), a novel online testing approach that addresses these challenges by combining Reinforcement Learning (RL) and many-objective search. MORLOT leverages RL to incrementally generate sequences of environmental changes, while relying on many-objective search to choose changes that are more likely to achieve any of the objectives not yet covered. We empirically evaluate MORLOT using CARLA, a high-fidelity simulator widely used for autonomous driving research, integrated with Transfuser, a DNN-enabled ADS for end-to-end driving. The evaluation results show that MORLOT is significantly more effective and efficient than alternatives, with a large effect size. In other words, MORLOT is a good option for testing DES in dynamically changing environments while accounting for multiple safety requirements.
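To make the idea of combining RL with many-objective search more concrete, the following is a minimal sketch, not the MORLOT implementation. It assumes tabular Q-learning with one Q-table per objective, a toy one-dimensional "light level" in place of a simulator such as CARLA, and two made-up requirements. All names (`ACTIONS`, `OBJECTIVES`, `step`, `reward`, `search`) are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict

# Everything below is an illustrative assumption, not taken from the paper.
ACTIONS = (-1, 0, 1)                            # dim, keep, or brighten the light
OBJECTIVES = {"too_dark": 0, "too_bright": 4}   # level at which each toy requirement is violated
START_LEVEL = 2

def step(level, action):
    """Toy stand-in for one simulator step: apply the change, clamp to [0, 4]."""
    return max(0, min(4, level + action))

def reward(level, objective):
    """Closer to 0 means closer to violating the requirement; 0 means violated."""
    return -abs(level - OBJECTIVES[objective])

def choose_action(q_tables, uncovered, state, epsilon, rng):
    """Epsilon-greedy choice that only looks at objectives not yet covered."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: max(q_tables[o][(state, a)] for o in uncovered))

def search(episodes=200, horizon=10, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q_tables = {o: defaultdict(float) for o in OBJECTIVES}  # one Q-table per objective
    uncovered = set(OBJECTIVES)
    archive = {}                                            # objective -> covering action sequence
    for _ in range(episodes):
        if not uncovered:
            break
        state, trace = START_LEVEL, []
        for _ in range(horizon):
            action = choose_action(q_tables, uncovered, state, epsilon, rng)
            nxt = step(state, action)
            trace.append(action)
            # Standard Q-learning update, applied to every objective's table.
            for o, q in q_tables.items():
                best_next = max(q[(nxt, a)] for a in ACTIONS)
                q[(state, action)] += alpha * (reward(nxt, o) + gamma * best_next - q[(state, action)])
            # Record the first action sequence that violates each requirement.
            for o in list(uncovered):
                if reward(nxt, o) == 0:
                    archive[o] = list(trace)
                    uncovered.discard(o)
            state = nxt
            if not uncovered:
                break
    return archive

if __name__ == "__main__":
    for name, trace in search().items():
        print(name, trace)
```

In this sketch the action at each step is chosen greedily with respect to the objectives that remain uncovered, so once one requirement has been violated the search shifts its effort to the others. A real setup would replace `step` and `reward` with simulator calls and requirement-specific distance functions.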