This paper introduces a sampling-based strategy synthesis algorithm for nondeterministic hybrid systems with complex continuous dynamics under temporal and reachability constraints. We view the evolution of the hybrid system as a two-player game, where the nondeterminism is an adversarial player whose objective is to prevent achieving temporal and reachability goals. The aim is to synthesize a winning strategy -- a reactive (robust) strategy that guarantees the satisfaction of the goals under all possible moves of the adversarial player. The approach is based on growing a (search) game-tree in the hybrid space by combining a sampling-based planning method with a novel bandit-based technique to select and improve on partial strategies. We provide conditions under which the algorithm is probabilistically complete, i.e., if a winning strategy exists, the algorithm will almost surely find it. The case studies and benchmark results show that the algorithm is general and consistently outperforms the state of the art.
翻译:本文介绍一种基于采样的策略合成算法,用于具有复杂连续动态的非确定性混合系统,在时间和可达性限制下。我们将混合系统的演变视为一个两个玩家的游戏,其中非确定性是一个对手玩家,其目标是防止实现时间和可达性目标。这种方法是基于在混合空间中增长(搜索)游戏树的策略,通过将基于采样的规划方法与一种新颖的类赌博机的技术相结合来选择和改进部分策略。我们提供了一些条件,使得该算法在概率上是完整的,即如果存在赢取策略,则该算法几乎可以确定找到它。案例研究和基准结果表明,该算法是通用的,并且始终优于现有技术水平。