As reinforcement learning methods accumulate successes, the need to understand their solutions becomes more pressing. Most explainable reinforcement learning (XRL) methods generate a static explanation that reflects their developers' intuition of what should be explained and how. In contrast, literature from the social sciences proposes that meaningful explanations are structured as a dialog between the explainer and the explainee, suggesting a more active role for the user and for her communication with the agent. In this paper, we present ASQ-IT -- an interactive tool that presents video clips of the agent acting in its environment, selected according to user queries that describe temporal properties of behaviors of interest. Our approach is based on formal methods: queries in ASQ-IT's user interface map to a fragment of Linear Temporal Logic over finite traces (LTLf) that we developed, and our query-processing algorithm is based on automata theory. User studies show that end-users can understand and formulate queries in ASQ-IT, and that using ASQ-IT helps users identify faulty agent behaviors.
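To make the idea of a temporal query over a finite trace concrete, the following is a minimal sketch and not ASQ-IT's actual implementation. It assumes a query made of three hypothetical predicates (`start`, `constraint`, `end`) and returns the index ranges of trace segments that satisfy the LTLf-style pattern `start ∧ X(constraint U end)`: `start` holds at the first step, `end` holds at some later step, and `constraint` holds at every step strictly in between. The function name `find_clips`, the `State` record, and the driving-style features `lane` and `speed` are all illustrative assumptions; the abstract does not specify the query fragment or the environment at this level of detail.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical per-step record of agent/environment features.
State = Dict[str, float]
Pred = Callable[[State], bool]


def find_clips(
    trace: List[State], start: Pred, constraint: Pred, end: Pred
) -> List[Tuple[int, int]]:
    """Return (i, j) index pairs such that start holds at i, end holds at j > i,
    and constraint holds at every step strictly between i and j.

    For each i, only the earliest matching j is reported.
    """
    clips: List[Tuple[int, int]] = []
    for i, state in enumerate(trace):
        if not start(state):
            continue
        # Scan forward like a small automaton: accept on `end`,
        # reject as soon as `constraint` is violated.
        for j in range(i + 1, len(trace)):
            if end(trace[j]):
                clips.append((i, j))
                break
            if not constraint(trace[j]):
                break
    return clips


# Illustrative trace with made-up features.
trace: List[State] = [
    {"lane": 1, "speed": 20},
    {"lane": 1, "speed": 25},
    {"lane": 2, "speed": 25},
    {"lane": 3, "speed": 30},
    {"lane": 3, "speed": 10},
]

# Query: start in lane 1, end in lane 3, never dropping below speed 20 in between.
clips = find_clips(
    trace,
    start=lambda s: s["lane"] == 1,
    constraint=lambda s: s["speed"] >= 20,
    end=lambda s: s["lane"] == 3,
)
print(clips)
```

Each returned pair could then be used to cut the corresponding video segment from a recorded episode. A full automata-based approach would compile the query formula into an automaton once and run it over many traces; the nested scan here is only a simplified stand-in for that idea.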