The function approximators employed by traditional image-based Deep Reinforcement Learning (DRL) algorithms usually lack a temporal learning component and instead focus on learning the spatial component alone. We propose a technique in which the temporal and spatial components are learned jointly. We evaluated our method against a generic DQN, which it outperformed in terms of both maximum reward and sample complexity. This algorithm has implications for robotics and other sequential decision-making domains.
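The abstract does not specify the architecture, but a common way to learn spatial and temporal components jointly is to feed per-frame CNN features into a recurrent layer, in the style of DRQN-like agents. Below is a minimal PyTorch sketch under that assumption; the class name, layer sizes, and input shape are illustrative choices, not the paper's actual design.

```python
# Hypothetical sketch: a Q-network with a CNN for the spatial component
# and an LSTM for the temporal component. All names and sizes are assumptions.
import torch
import torch.nn as nn

class SpatioTemporalQNetwork(nn.Module):
    def __init__(self, num_actions, in_channels=1):
        super().__init__()
        # Spatial component: convolutional encoder applied to each frame.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        # Temporal component: recurrent layer over the per-frame embeddings.
        # 64 * 7 * 7 is the flattened CNN output size for 84x84 inputs.
        self.lstm = nn.LSTM(input_size=64 * 7 * 7, hidden_size=512,
                            batch_first=True)
        self.head = nn.Linear(512, num_actions)

    def forward(self, frames, hidden=None):
        # frames: (batch, time, channels, height, width)
        b, t = frames.shape[:2]
        feats = self.encoder(frames.flatten(0, 1))  # (b*t, feat)
        feats = feats.view(b, t, -1)                # (b, t, feat)
        out, hidden = self.lstm(feats, hidden)      # temporal integration
        q_values = self.head(out[:, -1])            # Q-values from last step
        return q_values, hidden

# Usage: a batch of two 4-frame 84x84 grayscale sequences, 6 actions.
net = SpatioTemporalQNetwork(num_actions=6)
x = torch.randn(2, 4, 1, 84, 84)
q, _ = net(x)
print(q.shape)  # torch.Size([2, 6])
```

A generic DQN baseline would instead stack the frames along the channel dimension and use only the convolutional encoder, so the recurrent layer is what distinguishes the jointly learned temporal component in this sketch.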