Conventional deep reinforcement learning methods are sample-inefficient and usually require a large number of training trials before convergence. Because such methods operate on an unconstrained action set, they can waste exploration on useless actions. A recent neuro-symbolic framework, the Logical Neural Network (LNN), can simultaneously provide key properties of both neural networks and symbolic logic. An LNN functions as an end-to-end differentiable network that minimizes a novel contradiction loss to learn interpretable rules. In this paper, we utilize LNNs to define an inference graph from basic logical operations, such as AND and NOT, for faster convergence in reinforcement learning. Specifically, we propose an integrated method that enables model-free reinforcement learning to exploit external knowledge sources through LNN-based logical constraints such as action shielding and action guiding. Our results empirically demonstrate that the proposed method converges faster than a model-free reinforcement learning baseline without such logical constraints.
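To make the action-shielding idea concrete, the following is a minimal sketch, not the paper's LNN implementation: hand-written logical rules (the rule names, state features such as `holding_key`, and the action indices are all hypothetical) mask out forbidden actions before the greedy policy picks one.

```python
import numpy as np

# Hypothetical rule set mimicking the AND/NOT structure of an LNN
# inference graph: each rule maps the current state to the set of
# actions it forbids. All names and indices here are illustrative.
RULES = {
    # NOT holding_key -> forbid action 2 ("open door")
    "needs_key": lambda s: {2} if not s["holding_key"] else set(),
    # wall_ahead -> forbid action 0 ("move forward")
    "wall_block": lambda s: {0} if s["wall_ahead"] else set(),
}

def shield(q_values, state):
    """Mask the Q-values of actions forbidden by the logical rules,
    so that argmax can only select an admissible action."""
    forbidden = set().union(*(rule(state) for rule in RULES.values()))
    masked = np.array(q_values, dtype=float)
    masked[list(forbidden)] = -np.inf
    return masked

# Example: no key in hand and a wall ahead, so actions 0 and 2 are
# shielded out and the agent falls back to the best admissible action.
state = {"holding_key": False, "wall_ahead": True}
q = [0.4, 0.1, 0.9, 0.2]            # raw Q-values for 4 actions
print(np.argmax(shield(q, state)))  # -> 3
```

In the proposed framework, the fixed rule table above would be replaced by an LNN whose truth values are learned end-to-end, but the shielding mechanism, suppressing actions the logical constraints rule out, is the same.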