Cyber-physical systems (CPSs) are naturally modelled as reactive systems with nondeterministic and probabilistic dynamics. Model-based verification techniques have proved effective in the deployment of safety-critical CPSs. Central for a successful application of such techniques is the construction of an accurate formal model for the system. Manual construction can be a resource-demanding and error-prone process, thus motivating the design of automata learning algorithms to synthesise a system model from observed system behaviours. This paper revisits and adapts the classic Baum-Welch algorithm for learning Markov decision processes and Markov chains. For the case of MDPs, which typically demand more observations, we present a model-based active learning sampling strategy that choses examples which are most informative w.r.t.\ the current model hypothesis. We empirically compare our approach with state-of-the-art tools and demonstrate that the proposed active learning procedure can significantly reduce the number of observations required to obtain accurate models.
翻译:网络物理系统(CPS)自然是仿照非确定性和概率动态反应系统的模式。基于模型的核查技术已证明在部署安全临界的CPS方面行之有效。成功应用这类技术的核心是为系统建造一个准确的正式模型。手工构建可以是一个资源需求性和易出错的过程,从而鼓励设计自动数据学习算法,以综合观察到的系统行为模式。本文回顾并修改典型的 Baum-Welch 算法,用于学习Markov 决策程序和Markov 链。对于通常需要更多观察的 MDP 来说,我们提出了一个基于模型的积极学习抽样战略,选择了信息最丰富的 w.r.t.\ 目前的模型假设。我们用经验将我们的方法与最先进的工具进行比较,并表明拟议的积极学习程序可以大大减少获取准确模型所需的观测次数。