Synthesis of Deceptive Strategies in Reachability Games with Action Misperception (Technical Report)

Strategic deception is the act of manipulating an opponent's perception to gain a strategic advantage. In this paper, we study the synthesis of deceptive winning strategies in two-player, turn-based, zero-sum reachability games on graphs with one-sided incomplete information about action sets. In particular, we consider the class of games in which Player 1 (P1) starts with a non-empty set of private actions, which she may 'reveal' to Player 2 (P2) during the course of the game. P2 is equipped with an inference mechanism that he uses to update his perception of P1's action set whenever a new action is revealed. Under this information structure, the objective of P1 is to reach a set of goal states in the game graph, while that of P2 is to prevent her from doing so. We address the question: how can P1 leverage her information advantage to deceive P2 into choosing actions that in turn benefit P1? To this end, we introduce a dynamic hypergame model that captures the reachability game together with P2's evolving misperception. Analyzing the game qualitatively, we design algorithms to synthesize deceptive sure and almost-sure winning regions, and establish two key results: (1) under the sure-winning condition, the deceptive winning strategy is equivalent to the non-deceptive winning strategy, i.e., the use of deception offers no advantage; (2) under the almost-sure winning condition, the deceptive winning strategy can be more powerful than the non-deceptive strategy. We illustrate our algorithms using a capture-the-flag game, and demonstrate the application of the proposed approach to a larger class of games with temporal logic objectives.
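
For orientation, the non-deceptive baseline mentioned in result (1) can be illustrated with the textbook attractor computation for turn-based reachability games. The sketch below is not the report's hypergame algorithm: it ignores misperception entirely, and the function name, the graph encoding, and the toy states are all assumptions made for illustration. A state is added to P1's sure-winning region if P1 owns it and some successor is already winning, or if P2 owns it and every successor is already winning.

```python
def sure_winning_region(states, p1_states, edges, goal):
    """Textbook attractor fixed point for P1 in a turn-based reachability game.

    states    : iterable of all states (hypothetical encoding)
    p1_states : set of states where P1 moves; all other states belong to P2
    edges     : dict mapping a state to the set of its successor states
    goal      : set of goal states P1 wants to reach
    """
    win = set(goal)
    changed = True
    while changed:
        changed = False
        for s in states:
            if s in win:
                continue
            succ = edges.get(s, set())
            if not succ:
                # Assumption: a non-goal dead end is never winning for P1.
                continue
            if s in p1_states:
                ok = any(t in win for t in succ)   # P1 needs one good move
            else:
                ok = all(t in win for t in succ)   # P2 must have no escape
            if ok:
                win.add(s)
                changed = True
    return win


# Toy game (hypothetical): P1 moves at "a" and "c"; P2 moves at "b" and "d".
states = ["a", "b", "c", "d", "g"]
p1_states = {"a", "c"}
edges = {
    "a": {"b", "c"},
    "b": {"g", "d"},   # P2 can escape to the sink "d"
    "c": {"g"},
    "d": {"d"},
    "g": {"g"},
}
region = sure_winning_region(states, p1_states, edges, goal={"g"})
```

In this toy game P1 wins surely from "a" by moving to "c", while "b" is not winning because P2 can move to the sink "d".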
