A Conceptual Framework for Externally-influenced Agents: An Assisted Reinforcement Learning Review

A long-term goal of reinforcement learning agents is to be able to perform tasks in complex real-world scenarios. The use of external information is one way of scaling agents to more complex problems. However, there is a general lack of collaboration or interoperability between different approaches using external information. In this work, while reviewing externally-influenced methods, we propose a conceptual framework and taxonomy for assisted reinforcement learning, aimed at fostering collaboration by classifying and comparing various methods that use external information in the learning process. The proposed taxonomy details the relationship between the external information source and the learner agent, highlighting the process of information decomposition, structure, retention, and how it can be used to influence agent learning. As well as reviewing state-of-the-art methods, we identify current streams of reinforcement learning that use external information in order to improve the agent's performance and its decision-making process. These include heuristic reinforcement learning, interactive reinforcement learning, learning from demonstration, transfer learning, and learning from multiple sources, among others. These streams of reinforcement learning operate with the shared objective of scaffolding the learner agent. Lastly, we discuss further possibilities for future work in the field of assisted reinforcement learning systems.
