Optimal motion planning involves obstacle avoidance, and path planning is the key to success in optimal motion planning. Due to their computational demands, most path planning algorithms cannot be employed in real-time applications. Model-based reinforcement learning approaches to path planning have achieved some success in the recent past, yet most of them do not produce deterministic output because of their inherent randomness. We analyzed several reinforcement learning-based approaches to path planning: a deterministic tree-based approach and two approaches based on Q-learning and approximate policy gradients, respectively. We tested these approaches on two different simulators, each of which contains a set of random obstacles that can be changed or moved dynamically. After analyzing the results and computation times, we conclude that the deterministic tree search approach yields highly stable results, although its computation time is considerably higher than that of the other two approaches. Finally, we provide comparative results in terms of accuracy and computation time as evidence.
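The abstract names the three approaches without describing them, so as a hedged illustration of the second one, the sketch below shows minimal tabular Q-learning for grid-based path planning with obstacles. The grid layout, reward values, and hyperparameters (alpha, gamma, epsilon) are assumptions made for illustration, not the paper's implementation.

```python
import random

# Hypothetical 5x5 grid: 0 = free cell, 1 = obstacle (layout is illustrative only).
GRID = [
    [0, 0, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0],
]
START, GOAL = (0, 0), (4, 4)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(state, action):
    """Apply an action; hitting a wall or obstacle leaves the state unchanged."""
    r, c = state[0] + action[0], state[1] + action[1]
    if 0 <= r < len(GRID) and 0 <= c < len(GRID[0]) and GRID[r][c] == 0:
        state = (r, c)
    reward = 10.0 if state == GOAL else -1.0  # assumed reward shaping
    return state, reward, state == GOAL

def greedy_action(Q, state):
    """Pick the action with the highest learned Q-value (ties go to lower index)."""
    return max(range(len(ACTIONS)), key=lambda i: Q.get((state, i), 0.0))

# Tabular Q-learning with epsilon-greedy exploration (hyperparameters assumed).
Q, alpha, gamma, epsilon = {}, 0.1, 0.95, 0.1
for _ in range(2000):                     # training episodes
    state = START
    for _ in range(200):                  # step cap per episode
        a = (random.randrange(len(ACTIONS))
             if random.random() < epsilon else greedy_action(Q, state))
        nxt, reward, done = step(state, ACTIONS[a])
        best_next = max(Q.get((nxt, i), 0.0) for i in range(len(ACTIONS)))
        old = Q.get((state, a), 0.0)
        Q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
        state = nxt
        if done:
            break

# Greedy rollout of the learned policy; the step cap guards against loops.
state, path = START, [START]
for _ in range(50):
    state, _, done = step(state, ACTIONS[greedy_action(Q, state)])
    path.append(state)
    if done:
        break
print(path)  # e.g. a collision-free route from (0, 0) to (4, 4)
```

Note that the learned path can vary across runs because exploration is random, which illustrates the non-deterministic output the abstract contrasts with the deterministic tree search; seeding the random number generator would make an individual run reproducible.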