Deep Reinforcement Learning has proven capable of solving many control tasks across different fields, but these systems do not always behave as expected when deployed in real-world scenarios. This is mainly due to the lack of domain adaptation between simulated and real-world data, together with the absence of a distinction between training and test datasets. In this work, we investigate these problems in the autonomous driving field, focusing on a maneuver planning module for roundabout insertions. In particular, we present a system based on multiple environments in which agents are trained simultaneously, and we evaluate the behavior of the model in different scenarios. Finally, we analyze techniques aimed at reducing the gap between simulated and real-world data, showing that they increase the generalization capabilities of the system in both unseen and real-world scenarios.