In this paper, we introduce Latent Go-Explore (LGE), a simple and general approach based on the Go-Explore paradigm for exploration in reinforcement learning (RL). Go-Explore was initially introduced with a strong domain knowledge constraint for partitioning the state space into cells. However, in most real-world scenarios, drawing domain knowledge from raw observations is complex and tedious. If the cell partitioning is not informative enough, Go-Explore can completely fail to explore the environment. We argue that the Go-Explore approach can be generalized to any environment without domain knowledge and without cells by exploiting a learned latent representation. Thus, we show that LGE can be flexibly combined with any strategy for learning a latent representation. We show that LGE, although simpler than Go-Explore, is more robust and outperforms all state-of-the-art algorithms in terms of pure exploration on multiple hard-exploration environments. The LGE implementation is available as open-source at https://github.com/qgallouedec/lge.