An open research question in deep reinforcement learning is how to focus policy learning on the key decisions within a sparse domain. This paper combines the advantages of input-output hidden Markov models and reinforcement learning to produce interpretable maintenance decisions. We propose a novel hierarchical modeling methodology that, at the high level, detects and interprets the root cause of a failure as well as the health degradation of a turbofan engine and, at the low level, provides the optimal replacement policy. It outperforms baseline deep reinforcement learning methods, whether they are applied directly to the raw data or paired with a hidden Markov model that lacks this specialized hierarchy. Its performance is comparable to prior work, with the additional benefit of interpretability.