High-dimensional policies, such as those represented by neural networks, cannot be reasonably interpreted by humans. This lack of interpretability reduces user trust in policy behavior, limiting such policies to low-impact tasks such as video games. Unfortunately, many methods rely on neural network representations for effective learning. In this work, we propose a method for building predictable policy trees as surrogates for policies such as neural networks. The policy trees are easily interpretable by humans and provide quantitative predictions of future behavior. We demonstrate the performance of this approach on several simulated tasks.