In robot navigation, generalizing quickly to unseen environments is essential. Hierarchical methods inspired by human navigation have been proposed, typically consisting of a high-level landmark proposer and a low-level controller. However, these methods either require precise high-level information to be given in advance or must construct such guidance through extensive interaction with the environment. In this work, we propose an approach that leverages a rough 2-D map of the environment to navigate novel environments without further learning. In particular, we introduce a dynamic topological map that can be initialized from the rough 2-D map, along with a high-level planning approach that proposes reachable 2-D map patches for the intermediate landmarks between the start and goal locations. To use the proposed 2-D patches, we train a deep generative model to generate intermediate landmarks in observation space, which are used as subgoals by a low-level goal-conditioned reinforcement learning policy. Importantly, because the low-level controller is trained only on local behaviors (e.g., going across an intersection, turning left at a corner) in existing environments, this framework generalizes to novel environments given only a rough 2-D map, without requiring further learning. Experimental results demonstrate the effectiveness of the proposed framework in both seen and novel environments.
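The high-level part of the pipeline described above (a topological map initialized from a rough 2-D map, then a planner proposing intermediate landmarks between start and goal) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the occupancy-grid representation, the `stride` parameter, the breadth-first planner, and the function names `build_topological_map`, `plan_landmarks`, and `prune_edge` are all assumptions introduced here. In particular, removing an edge when a subgoal proves unreachable is only one plausible reading of "dynamic". The generative model and the goal-conditioned policy are out of scope for this sketch.

```python
from collections import deque

def build_topological_map(grid, stride=2):
    """Build a graph from a rough occupancy grid (0 = free, 1 = obstacle).

    Hypothetical construction: one node every `stride` cells on free space,
    with an edge between axis-aligned neighbours if every cell between them
    is free.
    """
    rows, cols = len(grid), len(grid[0])
    nodes = {(r, c)
             for r in range(0, rows, stride)
             for c in range(0, cols, stride)
             if grid[r][c] == 0}
    edges = {n: set() for n in nodes}
    for (r, c) in nodes:
        for ur, uc in ((1, 0), (0, 1)):  # look down and right only
            nb = (r + ur * stride, c + uc * stride)
            if nb not in nodes:
                continue
            between = [(r + ur * k, c + uc * k) for k in range(1, stride)]
            if all(grid[br][bc] == 0 for br, bc in between):
                edges[(r, c)].add(nb)
                edges[nb].add((r, c))
    return edges

def plan_landmarks(edges, start, goal):
    """Breadth-first search; returns the node sequence start..goal or None."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nb in sorted(edges[node]):  # sorted for deterministic output
            if nb not in parent:
                parent[nb] = node
                queue.append(nb)
    return None

def prune_edge(edges, a, b):
    """Assumed 'dynamic' update: drop an edge the controller failed to traverse."""
    edges[a].discard(b)
    edges[b].discard(a)

if __name__ == "__main__":
    rough_map = [
        [0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0],
    ]
    topo = build_topological_map(rough_map, stride=2)
    print(plan_landmarks(topo, (0, 0), (2, 4)))
    # Suppose the low-level controller could not get from (0, 4) to (2, 4):
    prune_edge(topo, (0, 4), (2, 4))
    print(plan_landmarks(topo, (0, 0), (2, 4)))
```

In the framework described in the abstract, each planned node would correspond to a 2-D map patch that the generative model turns into an observation-space subgoal. Here the nodes are simply grid coordinates, so the sketch shows only the plan-then-replan loop.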