We address the task of creating a reinforcement learning agent that is inherently explainable: one that can produce immediate local explanations by thinking out loud while performing a task, and can analyze entire trajectories post hoc to produce causal explanations. This Hierarchically Explainable Reinforcement Learning agent (HEX-RL) operates in interactive fiction games, text-based environments in which an agent perceives and acts upon the world using natural language. These games are usually structured as puzzles or quests with long-term dependencies, in which an agent must complete a sequence of actions to succeed, making them ideal environments for testing an agent's ability to explain its actions. Our agent is designed to treat explainability as a first-class citizen: it uses an extracted symbolic knowledge-graph state representation coupled with a Hierarchical Graph Attention mechanism that points to the facts in the internal graph representation that most influenced the choice of actions. Experiments show that this agent provides significantly improved explanations over strong baselines, as rated by human participants generally unfamiliar with the environment, while also matching state-of-the-art task performance.
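To make the idea of attention "pointing to" influential facts concrete, the following is a minimal sketch and not the HEX-RL implementation. It assumes, purely for illustration, that the knowledge graph is split into named subgraphs of (subject, relation, object) triples and that some upstream model has already produced a raw relevance score for each subgraph and each triple. The function `explain`, the subgraph names, and all scores are hypothetical. Attention is applied at two levels (over subgraphs, then over facts within each subgraph), and the product of the two weights ranks the facts that could be surfaced as a local explanation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def explain(subgraphs, fact_scores, graph_scores, top_k=2):
    """Rank knowledge-graph facts by two-level attention weight.

    subgraphs:    dict name -> list of (subject, relation, object) triples
    fact_scores:  dict name -> list of raw scores, one per triple (assumed given)
    graph_scores: dict name -> raw score for the whole subgraph (assumed given)
    Returns the top_k (weight, subgraph_name, triple) tuples.
    """
    names = list(subgraphs)
    # Level 1: attention over subgraphs.
    graph_attn = softmax([graph_scores[n] for n in names])
    ranked = []
    for name, g_w in zip(names, graph_attn):
        # Level 2: attention over the facts inside this subgraph.
        fact_attn = softmax(fact_scores[name])
        for triple, f_w in zip(subgraphs[name], fact_attn):
            # Combined weight; these sum to 1 across all facts.
            ranked.append((g_w * f_w, name, triple))
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked[:top_k]


# Hypothetical game state and scores, for illustration only.
subgraphs = {
    "inventory": [("you", "have", "lamp"), ("you", "have", "sword")],
    "room": [("troll", "in", "cellar")],
}
fact_scores = {"inventory": [2.0, 0.0], "room": [0.0]}
graph_scores = {"inventory": 0.0, "room": 1.0}

top = explain(subgraphs, fact_scores, graph_scores, top_k=2)
for weight, name, triple in top:
    print(f"{weight:.3f}  [{name}]  {' '.join(triple)}")
```

Because the combined weights form a probability distribution over all facts, the highest-weighted triples can be read directly as "the facts that most influenced this action" under this simplified scheme.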