Probabilistic model checking is a useful technique for specifying and verifying properties of stochastic systems, including randomized protocols and the theoretical underpinnings of reinforcement learning models. However, these methods rely on assumptions about the structure and probabilities of system transitions. These assumptions may be incorrect, and may even be violated if an adversary gains control of some or all components in the system. In this paper, motivated by research on adversarial examples in adversarial machine learning, we develop a formal framework for adversarial robustness in systems defined as discrete-time Markov chains (DTMCs), and extend it to deterministic, memoryless policies acting in Markov decision processes (MDPs). Our framework includes a flexible approach for specifying several adversarial models with different capabilities to manipulate the system. We outline a class of threat models under which adversaries can perturb system transitions, constrained by an $\varepsilon$-ball around the original transition probabilities, and define four specific instances of this threat model. We define three main DTMC adversarial robustness problems and present two optimization-based solutions, leveraging traditional and parametric probabilistic model checking techniques. We then evaluate our solutions on two stochastic protocols and a collection of GridWorld case studies, which model an agent acting in an environment described as an MDP. We find that the parametric solution yields fast computation for small parameter spaces. For less restrictive (stronger) adversaries, the number of parameters increases, and directly computing property satisfaction probabilities is more scalable. We demonstrate the usefulness of our definitions and solutions by comparing system outcomes across various properties, threat models, and case studies.