The problem of autonomous racing is to navigate through a race course as quickly as possible without colliding with any obstacles. We approach the autonomous racing problem with the added constraint of not maintaining an updated obstacle map of the environment. Several current approaches to this problem use end-to-end learning systems in which an agent replaces the entire navigation pipeline. This paper presents a hierarchical planning architecture that combines a high-level planner and path-following system with a reinforcement learning agent that learns the subsystem of obstacle avoidance. The novel "modification planner" uses the path follower to track the global plan and the deep reinforcement learning agent to modify the references generated by the path follower to avoid obstacles. Importantly, our architecture does not require an updated obstacle map and needs only 10 laser range finders to avoid obstacles. The modification planner is evaluated in the context of F1/10th autonomous racing and compared to an end-to-end learning baseline, the Follow the Gap Method, and an optimisation-based planner. The results show that the modification planner achieves faster average times than the baseline end-to-end planner and a 94% success rate, which is similar to the baseline.
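The modification planner described above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the pure-pursuit steering law used as the path follower, the additive form of the modification, the `MAX_STEER` limit, the wheelbase value, and the `agent` callable that stands in for the trained deep reinforcement learning policy.

```python
import math
from typing import Callable, Sequence, Tuple

MAX_STEER = 0.4  # rad; hypothetical steering limit, not from the paper

Pose = Tuple[float, float, float]  # x, y, heading (rad)
Point = Tuple[float, float]


def pure_pursuit_steering(pose: Pose, target: Point, wheelbase: float = 0.33) -> float:
    """Steering reference toward a lookahead point on the global plan.

    Pure pursuit is an assumed choice of path follower for this sketch.
    """
    x, y, heading = pose
    dx, dy = target[0] - x, target[1] - y
    dist_sq = dx * dx + dy * dy
    if dist_sq == 0.0:
        return 0.0
    # Lateral offset of the target expressed in the vehicle frame.
    local_y = -math.sin(heading) * dx + math.cos(heading) * dy
    curvature = 2.0 * local_y / dist_sq
    return math.atan(wheelbase * curvature)


def modification_planner(
    pose: Pose,
    target: Point,
    ranges: Sequence[float],
    agent: Callable[[float, Sequence[float]], float],
) -> float:
    """Combine the path follower's reference with the agent's modification."""
    if len(ranges) != 10:
        raise ValueError("expected 10 range measurements")
    reference = pure_pursuit_steering(pose, target)
    # The agent sees the reference and the range readings, not an obstacle map.
    modification = agent(reference, ranges)
    steering = reference + modification  # additive form is an assumption
    return max(-MAX_STEER, min(MAX_STEER, steering))
```

In this sketch an agent that always returns zero reduces the planner to plain path following, which separates global-plan tracking from the learned obstacle avoidance.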