With deep reinforcement learning (RL) systems such as autonomous driving being widely deployed yet remaining largely opaque, developers frequently use explainable RL (XRL) tools to better understand and work with deep RL agents. However, previous XRL work takes a techno-centric research approach that ignores how RL developers perceive the generated explanations. Through a pilot study, we identify the major goals RL practitioners have when using XRL methods, along with four pitfalls that widen the gap between existing XRL methods and these goals: inaccessible reasoning processes, inconsistent explanations, unintelligible explanations, and explanations that cannot be generalized. To fill this gap, we propose a counterfactual-inference-based explanation method that uncovers the details of an RL agent's reasoning process and generates natural language explanations. Around this method, we build an interactive XRL system in which users can actively explore explanations and influential information. In a user study with 14 participants, we found that developers identified 20.9% more abnormal behaviors and limitations of RL agents with our system than with the baseline method, and that using our system helped end users improve their performance in actionability tests by 25.1% in an auto-driving task and by 16.9% in a StarCraft II micromanagement task.