Multi-agent robotic systems are increasingly operating in real-world environments in close proximity to humans, yet are largely controlled by policy models with inscrutable deep neural network representations. We introduce a method for incorporating interpretable concepts from a domain expert into models trained through multi-agent reinforcement learning, by requiring the model to first predict these concepts and then use them for decision making. This allows an expert both to reason about the resulting concept policy models in terms of these high-level concepts at run time, and to intervene and correct mispredictions to improve performance. We show that this yields improved interpretability and training stability, with benefits to policy performance and sample efficiency in a simulated and real-world cooperative-competitive multi-agent game.
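To make the concept-bottleneck idea concrete, the following is a minimal sketch of such a policy in PyTorch. The class name, layer sizes, and the NaN-masked intervention convention are illustrative assumptions, not the paper's implementation: observations are first mapped to expert-defined concepts, actions are computed from the concepts alone, and an expert can override individual concept predictions at run time.

```python
import torch
import torch.nn as nn


class ConceptBottleneckPolicy(nn.Module):
    """Hypothetical concept-bottleneck policy: the concept layer is the
    sole interface between observations and action selection."""

    def __init__(self, obs_dim: int, n_concepts: int, n_actions: int, hidden: int = 64):
        super().__init__()
        # Concept predictor: observation -> predicted concept values,
        # trained against expert-labeled concepts alongside the RL objective.
        self.concept_head = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_concepts)
        )
        # Action head consumes only the (possibly corrected) concepts.
        self.action_head = nn.Sequential(
            nn.Linear(n_concepts, hidden), nn.ReLU(), nn.Linear(hidden, n_actions)
        )

    def forward(self, obs: torch.Tensor, intervention: torch.Tensor | None = None):
        concepts = self.concept_head(obs)
        if intervention is not None:
            # Run-time intervention: a NaN entry means "keep the model's
            # prediction"; any other value overrides that concept.
            concepts = torch.where(torch.isnan(intervention), concepts, intervention)
        action_logits = self.action_head(concepts)
        # Returning the concepts lets an expert inspect them at run time.
        return action_logits, concepts
```

Because the action head sees only the concept vector, inspecting or correcting that vector is sufficient to explain or steer the policy's behavior, which is what enables the run-time intervention described above.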