For real-world deployments, it is critical that robots can navigate complex environments autonomously. Traditional methods usually maintain an internal map of the environment and then design several simple rules, in conjunction with a localization and planning approach, to navigate through that map. These approaches often rely on a variety of assumptions and prior knowledge. In contrast, recent reinforcement learning (RL) methods provide a model-free, self-learning mechanism as the robot interacts with an initially unknown environment, but they are expensive to deploy in real-world scenarios due to inefficient exploration. In this paper, we focus on efficient navigation with RL and combine the advantages of these two kinds of methods into a rule-based RL (RuRL) algorithm that reduces sample complexity and time cost. First, we use the wall-following rule to generate a closed-loop trajectory. Second, we employ a reduction rule to shrink the trajectory, which in turn effectively reduces the redundant exploration space. Moreover, we provide a theoretical guarantee that the optimal navigation path remains within the reduced space. Third, within the reduced space, we utilize the Pledge rule to guide the exploration strategy, accelerating the RL process in its early stage. Experiments on real robot navigation problems in hex-grid environments demonstrate that RuRL achieves improved navigation performance.
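The Pledge rule referenced above is a classic maze-escape heuristic: move in a preferred direction until blocked, then follow the wall while keeping a counter of net turns, and leave the wall only when the counter returns to zero (at which point the heading again equals the preferred direction). The counter is what distinguishes it from plain wall-following and prevents the agent from circling certain obstacles forever. The sketch below is only an illustration of this rule on a simple 4-connected occupancy grid; the paper's experiments use hex-grid environments, and the grid encoding, function names, and step budget here are illustrative assumptions rather than the authors' implementation.

```python
# Minimal sketch of Pledge-rule guidance on a square occupancy grid
# (0 = free cell, 1 = obstacle). Illustrative only; not the paper's hex-grid setup.

DIRS = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # E, S, W, N as (row, col) offsets

def free(grid, pos):
    r, c = pos
    return 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] == 0

def pledge_walk(grid, start, goal, preferred=0, max_steps=10_000):
    """Move toward `goal` with the Pledge rule: advance in the preferred
    direction; on hitting a wall, follow it (left-hand rule) while counting
    net turns; resume free motion when the turn counter returns to zero."""
    pos, heading, turns = start, preferred, 0
    path = [pos]
    for _ in range(max_steps):
        if pos == goal:
            break
        ahead = (pos[0] + DIRS[heading][0], pos[1] + DIRS[heading][1])
        if turns == 0:
            # Free motion: keep going in the preferred direction.
            if free(grid, ahead):
                pos = ahead
                path.append(pos)
            else:
                heading = (heading + 1) % 4   # turn right, start wall-following
                turns -= 1
        else:
            # Wall-following (left-hand rule): prefer left, then straight, then right.
            left = (heading + 3) % 4
            left_cell = (pos[0] + DIRS[left][0], pos[1] + DIRS[left][1])
            if free(grid, left_cell):
                heading, turns = left, turns + 1
                pos = left_cell
                path.append(pos)
            elif free(grid, ahead):
                pos = ahead
                path.append(pos)
            else:
                heading = (heading + 1) % 4
                turns -= 1
    return path

if __name__ == "__main__":
    # A vertical wall blocks the eastward path; the walker detours around
    # its lower end and resumes heading east once the turn counter is zero.
    grid = [[0, 0, 0, 0, 0],
            [0, 0, 1, 0, 0],
            [0, 0, 1, 0, 0],
            [0, 0, 1, 0, 0],
            [0, 0, 0, 0, 0]]
    print(pledge_walk(grid, start=(2, 0), goal=(4, 4)))
    # -> [(2, 0), (2, 1), (3, 1), (4, 1), (4, 2), (4, 3), (4, 4)]
```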