We introduce a novel approach to dynamic obstacle avoidance based on Deep Reinforcement Learning by defining a traffic-type-independent environment with variable complexity. Filling a gap in the current literature, we thoroughly investigate the effect of missing velocity information on an agent's performance in obstacle avoidance tasks. This is a crucial issue in practice, since several sensors yield only the positional information of objects or vehicles. We evaluate frequently applied approaches for scenarios of partial observability, namely the incorporation of recurrence in the deep neural networks and simple frame-stacking. For our analysis, we rely on state-of-the-art model-free deep RL algorithms. The lack of velocity information is found to significantly impact an agent's performance. Neither approach, recurrence nor frame-stacking, can consistently replace missing velocity information in the observation space. However, in simplified scenarios, both can significantly boost performance and stabilize the overall training procedure.
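To make the frame-stacking idea concrete, the following is a minimal sketch (not the paper's implementation) of a wrapper that concatenates the last k position-only observations, so that a network can in principle infer velocities from consecutive positions; the class name and interface are illustrative assumptions:

```python
from collections import deque

import numpy as np


class FrameStack:
    """Hypothetical frame-stacking helper: keeps the last k
    position-only observations and concatenates them, giving
    the agent a chance to infer unobserved velocities from
    consecutive positions."""

    def __init__(self, k):
        self.k = k
        self.frames = deque(maxlen=k)

    def reset(self, obs):
        # Fill the buffer with copies of the first observation.
        for _ in range(self.k):
            self.frames.append(obs)
        return self._stacked()

    def step(self, obs):
        # Push the newest observation; the oldest drops out.
        self.frames.append(obs)
        return self._stacked()

    def _stacked(self):
        return np.concatenate(list(self.frames), axis=-1)


# Two consecutive 2-D positions let a network estimate velocity:
stack = FrameStack(k=2)
s0 = stack.reset(np.array([0.0, 0.0]))
s1 = stack.step(np.array([0.1, 0.0]))  # obstacle moved along x
```

A recurrent alternative (e.g. an LSTM layer in the policy network) instead carries this history implicitly in its hidden state; the abstract reports that neither variant fully compensates for the missing velocities.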