Goal-conditioned reinforcement learning (GCRL), which addresses a family of complex RL problems, trains an agent to achieve different goals in particular scenarios. Whereas standard RL learns a policy that depends solely on states or observations, GCRL additionally requires the agent to condition its decisions on the goal it is given. In this survey, we provide a comprehensive overview of the challenges and algorithms for GCRL. First, we describe the basic problems studied in this field. Then, we explain how goals are represented and present how existing solutions are designed from different points of view. Finally, we conclude and discuss potential future directions that recent research focuses on.