This paper is concerned with the optimal allocation of detection resources (sensors) to mitigate multi-stage attacks, in the presence of the defender's uncertainty in the attacker's intention. We model the attack planning problem using a Markov decision process and characterize the uncertainty in the attacker's intention using a finite set of reward functions -- each reward represents a type of the attacker. Based on this modeling framework, we employ the paradigm of the worst-case absolute regret minimization from robust game theory and develop mixed-integer linear program (MILP) formulations for solving the worst-case regret minimizing sensor allocation strategies for two classes of attack-defend interactions: one where the defender and attacker engage in a zero-sum game, and another where they engage in a non-zero-sum game. We demonstrate the effectiveness of our framework using a stochastic gridworld example.
翻译:暂无翻译