Deep reinforcement learning (RL) methods often require many trials before convergence, and the trained policies are not directly interpretable. To achieve both fast convergence and policy interpretability in RL, we propose a novel RL method for text-based games built on a recent neuro-symbolic framework, the Logical Neural Network (LNN), which can learn symbolic, interpretable rules in its differentiable network. The method first extracts first-order logical facts from the text observation and an external word-meaning network (ConceptNet), then trains a policy in the network using directly interpretable logical operators. Our experimental results show that RL training with the proposed method converges significantly faster than other state-of-the-art neuro-symbolic methods on a TextWorld benchmark.
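The two-stage pipeline sketched above can be illustrated with a minimal, hypothetical example (not the authors' implementation): simple pattern rules extract first-order facts from a text observation, and a real-valued Łukasiewicz-style conjunction, of the kind LNN operators are built from, scores an interpretable action rule. The predicates `at`, `sees`, and the rule `take(apple)` are illustrative assumptions.

```python
# Hypothetical sketch of the method's two stages:
# (1) extract first-order logical facts from a text observation,
# (2) evaluate an interpretable rule with a differentiable logical AND.

import re

def extract_facts(observation):
    """Turn a raw text observation into facts like at(kitchen) -> truth value."""
    facts = {}
    m = re.search(r"You are in the (\w+)", observation)
    if m:
        facts[f"at({m.group(1)})"] = 1.0  # truth values lie in [0, 1]
    for item in re.findall(r"You see an? (\w+)", observation):
        facts[f"sees({item})"] = 1.0
    return facts

def lukasiewicz_and(a, b):
    """Real-valued conjunction in the Lukasiewicz family: max(0, a + b - 1)."""
    return max(0.0, a + b - 1.0)

def rule_take_apple(facts):
    """Interpretable rule: take(apple) <- at(kitchen) AND sees(apple)."""
    return lukasiewicz_and(facts.get("at(kitchen)", 0.0),
                           facts.get("sees(apple)", 0.0))

obs = "You are in the kitchen. You see an apple."
print(rule_take_apple(extract_facts(obs)))  # 1.0 when both facts hold
```

In the full method these conjunction weights are learned during RL training, so the rule itself remains readable after convergence.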