In adversarial patrolling games, a mobile Defender strives to detect intrusions initiated by an Attacker at vulnerable targets. The Attacker's utility is traditionally defined as the probability of completing an attack, possibly weighted by target costs. However, in many real-world scenarios, the actual damage caused by the Attacker depends on the \emph{time} elapsed between the attack's initiation and its detection. We introduce a formal model for such scenarios, and we show that the Defender always has an \emph{optimal} strategy achieving maximal protection. We also prove that \emph{finite-memory} Defender strategies are sufficient for achieving protection arbitrarily close to the optimum. Then, we design an efficient \emph{strategy synthesis} algorithm based on differentiable programming and gradient descent.
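To make the idea of gradient-based strategy synthesis with time-dependent damage concrete, the following is a minimal sketch on a toy instance. It is not the model or the algorithm of this work. All names (`softmax`, `expected_damage`, `synthesize`, `tau`, `lr`) and all modeling choices are assumptions made for illustration: the Defender is memoryless and inspects target $i$ with probability $p_i$ at every step, damage grows linearly with the time to detection, and so the expected damage of an attack on target $i$ is $c_i / p_i$. The sketch runs plain gradient descent on softmax logits, with the Attacker's maximum replaced by a smooth log-sum-exp surrogate and a hand-derived gradient.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def expected_damage(costs, probs):
    """Expected damage per target in the toy model (an assumption).

    Detection time at target i is geometric with success probability
    probs[i], so its mean is 1 / probs[i]; damage is cost times time.
    """
    return [c / p for c, p in zip(costs, probs)]


def synthesize(costs, tau=0.01, lr=0.01, steps=5000):
    """Gradient descent on the smoothed log of the worst-case damage.

    Objective (hypothetical surrogate, not taken from the source):
        f(theta) = tau * LSE((log c - theta) / tau) + LSE(theta),
    whose gradient is softmax(theta) - softmax((log c - theta) / tau).
    """
    log_c = [math.log(c) for c in costs]
    theta = [0.0] * len(costs)
    for _ in range(steps):
        probs = softmax(theta)
        # Soft weights over the Attacker's (near-)best responses.
        weights = softmax([(a - t) / tau for a, t in zip(log_c, theta)])
        grad = [p - w for p, w in zip(probs, weights)]
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return softmax(theta)


costs = [1.0, 2.0, 3.0]
probs = synthesize(costs)
worst = max(expected_damage(costs, probs))
```

In this toy instance the exact optimum is known in closed form: $p_i \propto c_i$, with worst-case expected damage $\sum_i c_i$. The sketch should therefore return probabilities close to $(1/6, 1/3, 1/2)$ and a worst-case damage slightly above $6$, the gap being due to the smoothing temperature `tau`.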