Autonomous systems such as self-driving cars and general-purpose robots are safety-critical systems that operate in highly uncertain and dynamic environments. We propose an interactive multi-agent framework in which the system under design is modeled as an ego agent and its environment is modeled by a number of adversarial (ado) agents. For example, a self-driving car is an ego agent whose behavior is influenced by ado agents such as pedestrians, bicyclists, traffic lights, and road geometry. Given a logical specification of the correct behavior of the ego agent and a set of constraints that encode reasonable adversarial behavior, our framework reduces the adversarial testing problem to that of synthesizing controllers for the (constrained) ado agents that cause the ego agent to violate its specification. Specifically, we explore the use of tabular and deep reinforcement learning approaches for synthesizing such adversarial agents. We show that ado agents trained in this fashion outperform traditional falsification and testing techniques because they generalize to ego agents and environments that differ from those used during training. We demonstrate the efficacy of our technique on two real-world case studies from the domain of self-driving cars.
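To make the reduction concrete, the sketch below shows one way an ado agent can be trained with tabular Q-learning to falsify a safety specification. The toy car-following environment, the scripted ego controller, the bounded-acceleration constraint on the ado lead car, and the reward shaping are all illustrative assumptions for this example and are not the benchmarks or implementation used in the paper; the point it demonstrates is only the core idea that the ado agent is rewarded exactly when the ego agent violates its specification ("never collide with the lead vehicle").

import random
from collections import defaultdict

# Minimal illustrative sketch (not the paper's code): tabular Q-learning for an
# adversarial (ado) lead car that tries to make a simple scripted ego follower
# violate the safety specification "never collide with the lead vehicle".
# The environment, ego controller, and reward shaping are assumptions made
# for this example.

MAX_SPEED, MAX_GAP, HORIZON = 3, 10, 20

def reset():
    # state = (gap to lead car, ego speed, lead speed)
    return (5, 2, 2)

def ego_policy(gap, ego_speed):
    # Scripted ego under test: brake when close, accelerate when far.
    if gap < 3:
        return max(ego_speed - 1, 0)
    if gap > 4:
        return min(ego_speed + 1, MAX_SPEED)
    return ego_speed

def step(state, ado_action):
    # ado_action in {-1, 0, +1}: bounded braking/acceleration of the lead car
    # (a "reasonable adversary" constraint: no reversing, no teleporting).
    gap, ego_speed, lead_speed = state
    lead_speed = min(max(lead_speed + ado_action, 0), MAX_SPEED)
    ego_speed = ego_policy(gap, ego_speed)
    gap = min(gap + lead_speed - ego_speed, MAX_GAP)
    violation = gap <= 0                      # rear-end collision
    reward = 10.0 if violation else -0.1      # ado is rewarded for violations
    return (gap, ego_speed, lead_speed), reward, violation

Q = defaultdict(float)
ACTIONS = (-1, 0, 1)
alpha, gamma, eps = 0.1, 0.95, 0.2

for episode in range(20000):
    state, done = reset(), False
    for _ in range(HORIZON):
        if random.random() < eps:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        nxt, reward, done = step(state, action)
        target = reward if done else reward + gamma * max(Q[(nxt, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (target - Q[(state, action)])
        state = nxt
        if done:
            break

# Greedy rollout of the learned adversary: it should brake so that the ego's
# simple controller cannot react in time, producing a collision in a few steps.
state, done = reset(), False
trace = [state]
while not done and len(trace) < HORIZON:
    action = max(ACTIONS, key=lambda a: Q[(state, a)])
    state, _, done = step(state, action)
    trace.append(state)
print(trace)

A deep reinforcement learning variant of the same idea would replace the Q-table with a function approximator over continuous states and actions, while keeping the same violation-based reward structure and the same constraints on reasonable ado behavior.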