MDPs 参数合成的 Convinx 优化 (Convex Optimization for Parameter Synthesis in MDPs)

Probabilistic model checking aims to prove whether a Markov decision process (MDP) satisfies a temporal logic specification. The underlying methods rely on an often unrealistic assumption that the MDP is precisely known. Consequently, parametric MDPs (pMDPs) extend MDPs with transition probabilities that are functions over unspecified parameters. The parameter synthesis problem is to compute an instantiation of these unspecified parameters such that the resulting MDP satisfies the temporal logic specification. We formulate the parameter synthesis problem as a quadratically constrained quadratic program (QCQP), which is nonconvex and is NP-hard to solve in general. We develop two approaches that iteratively obtain locally optimal solutions. The first approach exploits the so-called convex-concave procedure (CCP), and the second approach utilizes a sequential convex programming (SCP) method. The techniques improve the runtime and scalability by multiple orders of magnitude compared to black-box CCP and SCP by merging ideas from convex optimization and probabilistic model checking. We demonstrate the approaches on a satellite collision avoidance problem with hundreds of thousands of states and tens of thousands of parameters and their scalability on a wide range of commonly used benchmarks.

翻译：概率模型检查旨在证明Markov决定过程(MDP)是否符合时间逻辑规格。基本方法所依据的假设往往是不切实际的,即MDP是众所周知的。因此,参数 MDP(pMDPs)扩展了具有过渡概率的MDPs,而过渡概率则功能高于未指明的参数。参数合成问题是对这些未说明的参数进行即时计算,从而使由此产生的MDP符合时间逻辑规格。我们将参数合成问题作为一个四进制受限制的四进制程序(QCQP)来设计,这是一个非对立的四进制程序(QCQP),并且很难解决一般的NP。我们开发了两种迭代获得当地最佳解决方案的方法。第一种是利用所谓的convex-concave 程序(CCP),第二种是使用顺序组合程序(SCP)方法,以便计算出这些参数的速率,从而使MDP满足时间和SCP的多级,通过黑箱CCP和SCP(QQQQ)来改进运行时间和缩缩度。我们展示了卫星碰撞避免碰撞问题的方法,以千万个共同的参数和千个标准为基准。