Advances in multi-agent reinforcement learning (MARL) enable sequential decision making for a range of exciting multi-agent applications such as cooperative AI and autonomous driving. Explaining agent decisions is crucial for improving system transparency, increasing user satisfaction, and facilitating human-agent collaboration. However, existing work on explainable reinforcement learning mostly focuses on the single-agent setting and is not suitable for addressing the challenges posed by multi-agent environments. We present novel methods to generate two types of policy explanations for MARL: (i) policy summarization of agent cooperation and task sequences, and (ii) language explanations that answer queries about agent behavior. Experimental results on three MARL domains demonstrate the scalability of our methods. A user study shows that the generated explanations significantly improve user performance and increase subjective ratings on metrics such as user satisfaction.