We introduce stochastic decision Petri nets (SDPNs), a form of stochastic Petri nets equipped with rewards and with a control mechanism based on the deactivation of controllable transitions. Such nets can be translated into Markov decision processes (MDPs), potentially leading to a combinatorial explosion in the number of states due to concurrency. Hence we restrict ourselves to instances where the nets are safe, free-choice and acyclic (SAFC nets) or even occurrence nets, and where policies are defined by a constant deactivation pattern. For such cases we obtain complexity-theoretic results via a close connection to Bayesian networks; in particular, we show that for SAFC nets the question of whether there is a policy guaranteeing a reward above a given threshold is $\mathsf{NP}^\mathsf{PP}$-complete. We also introduce a partial-order procedure that uses an SMT solver to address this problem.