Reinforcement learning is a machine learning approach inspired by behavioral psychology. It focuses on learning agents that acquire knowledge and learn to carry out new tasks by interacting with their environment. However, a problem arises when reinforcement learning is used in critical contexts, where the users of the system need more information about, and more confidence in, the actions executed by an agent. In this regard, explainable reinforcement learning seeks to equip a learning agent with methods to explain its behavior in such a way that users with no experience in machine learning can understand it. One such method is memory-based explainable reinforcement learning, which uses an episodic memory to compute a probability of success for each state-action pair. In this work, we propose applying the memory-based explainable reinforcement learning method in a hierarchical environment composed of sub-tasks that must first be addressed in order to solve a more complex task. The end goal is to verify whether it is possible to give the agent the ability to explain its actions in the global task as well as in the sub-tasks. The results obtained show that the memory-based method can be used in hierarchical environments with high-level tasks, and that the computed probabilities of success can serve as a basis for explaining the agent's behavior.
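To make the idea concrete, the following is a minimal sketch of the memory-based mechanism described above: an episodic memory tallies, for each state-action pair, how often episodes passing through it ended in success, yielding an empirical probability of success the agent can cite when explaining its choices. The class and method names here are illustrative assumptions, not the exact implementation from the cited method.

```python
from collections import defaultdict

class EpisodicMemory:
    """Illustrative episodic memory for memory-based explainability.

    Tracks, per (state, action) pair, how many episodes used the pair
    and how many of those episodes ended successfully. (Sketch only;
    names and structure are assumptions, not the paper's exact method.)
    """

    def __init__(self):
        self.visits = defaultdict(int)     # episodes in which (s, a) was used
        self.successes = defaultdict(int)  # of those, episodes ending in success
        self._episode = []                 # pairs seen in the current episode

    def record(self, state, action):
        self._episode.append((state, action))

    def end_episode(self, success):
        # Credit every distinct pair visited during the finished episode.
        for pair in set(self._episode):
            self.visits[pair] += 1
            if success:
                self.successes[pair] += 1
        self._episode = []

    def probability_of_success(self, state, action):
        pair = (state, action)
        if self.visits[pair] == 0:
            return 0.0
        return self.successes[pair] / self.visits[pair]

# Toy usage: two episodes, one successful.
memory = EpisodicMemory()
memory.record("s0", "right"); memory.record("s1", "up")
memory.end_episode(success=True)
memory.record("s0", "right"); memory.record("s1", "down")
memory.end_episode(success=False)
print(memory.probability_of_success("s0", "right"))  # 0.5
```

In a hierarchical setting, one memory of this kind could be kept per sub-task in addition to one for the global task, so that explanations are available at both levels.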