Path planning is an important topic in robotics. Recently, deep learning models based on value iteration (VI), such as the Value Iteration Network (VIN), have achieved good performance. However, previous methods suffer from slow convergence and low accuracy on large maps, which restricts their use in path planning for agents with complex kinematics such as legged robots. We therefore propose a new VI-based path planning method called the Capability Iteration Network (CIN). CIN uses sparse reward maps and encodes the capability of the agent with state-action transition probabilities, rather than with a convolution kernel as in previous models. Furthermore, two training methods are proposed, end-to-end training and training the capability module alone, both of which greatly speed up convergence. Path planning experiments are conducted in various scenarios, including 2D and 3D grid worlds and real robots, with different map sizes. The results demonstrate that CIN has higher accuracy, faster convergence, and lower sensitivity to the random seed than previous VI-based models, making it more applicable to real robot path planning.
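To make the idea of planning over a sparse reward map with explicit state-action transition probabilities concrete, here is a minimal tabular sketch in Python with NumPy. It is not the CIN architecture and involves no learning. The function `value_iteration`, the `slip` matrix standing in for agent capability, and the grid layout are all hypothetical names and assumptions chosen for illustration.

```python
import numpy as np

# Four grid moves: up, down, left, right (row delta, col delta).
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def value_iteration(obstacles, goal, slip, gamma=0.95, iters=100):
    """Tabular value iteration on a 2D grid with a sparse reward map.

    obstacles: bool array, True where a cell is blocked.
    goal: (row, col) of the single rewarded cell.
    slip: (4, 4) matrix; slip[a, m] is the assumed probability that
          commanding action a actually executes move m (illustrative
          stand-in for the agent's capability).
    """
    h, w = obstacles.shape
    reward = np.zeros((h, w))
    reward[goal] = 1.0  # sparse reward: only the goal cell is non-zero
    value = np.zeros((h, w))
    for _ in range(iters):
        # Return obtained by executing each move from each cell.
        nxt = np.empty((4, h, w))
        for m, (dr, dc) in enumerate(MOVES):
            for r in range(h):
                for c in range(w):
                    rr, cc = r + dr, c + dc
                    if not (0 <= rr < h and 0 <= cc < w) or obstacles[rr, cc]:
                        rr, cc = r, c  # blocked move: stay in place
                    nxt[m, r, c] = reward[rr, cc] + gamma * value[rr, cc]
        # Expected return of each commanded action under the slip model.
        q = np.einsum("am,mrc->arc", slip, nxt)
        value = q.max(axis=0)
        value[goal] = 0.0       # treat the goal as terminal
        value[obstacles] = 0.0  # blocked cells carry no value
    return value, q.argmax(axis=0)


# Usage: the same 3x3 map planned for two hypothetical agents.
grid = np.zeros((3, 3), dtype=bool)
ideal = np.eye(4)                   # every commanded move succeeds
noisy = 0.6 * np.eye(4) + 0.1       # 70% intended move, 10% each other move
v_ideal, pi_ideal = value_iteration(grid, (0, 2), ideal)
v_noisy, _ = value_iteration(grid, (0, 2), noisy)
```

Changing only the transition matrix changes the resulting values and policy for the same map. This is the sense in which transition probabilities, rather than a fixed kernel, can describe what a particular agent is able to do.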