At Human Speed: Deep Reinforcement Learning with Action Delay

There has been a recent explosion in the capabilities of game-playing artificial intelligence. Many classes of tasks, from video games to motor control to board games, are now solvable by fairly generic algorithms, based on deep learning and reinforcement learning, that learn to play from experience with minimal prior knowledge. However, these machines often do not win through intelligence alone -- they possess vastly superior speed and precision, allowing them to act in ways a human never could. To level the playing field, we restrict the machine's reaction time to a human level, and find that standard deep reinforcement learning methods quickly drop in performance. We propose a solution to the action delay problem inspired by human perception -- to endow agents with a neural predictive model of the environment which "undoes" the delay inherent in their environment -- and demonstrate its efficacy against professional players in Super Smash Bros. Melee, a popular console fighting game.
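The idea of "undoing" the delay can be illustrated with a minimal sketch, which is not the paper's implementation. It assumes a toy one-dimensional environment in which every action takes effect a fixed number of steps after it is issued. The names `step_model`, `DelayedEnv`, `undo_delay`, `policy`, and `run` are hypothetical, and `step_model` is an exact hand-written dynamics function standing in for a learned neural predictive model. The agent rolls the model forward through the actions that are already in flight, then acts on the predicted state rather than on the stale observation.

```python
from collections import deque


def step_model(state, action):
    """Hypothetical stand-in for a learned one-step predictive model."""
    return state + action


class DelayedEnv:
    """Toy 1-D environment whose actions take effect `delay` steps late."""

    def __init__(self, delay, start=0):
        self.state = start
        # Actions issued but not yet applied, oldest first (0 is a no-op).
        self.pending = deque([0] * delay)

    def step(self, action):
        self.pending.append(action)
        applied = self.pending.popleft()
        # The toy environment uses the same dynamics as the model.
        self.state = step_model(self.state, applied)
        return self.state


def undo_delay(observed_state, pending_actions):
    """Predict the state at which a newly issued action will take effect."""
    predicted = observed_state
    for action in pending_actions:
        predicted = step_model(predicted, action)
    return predicted


def policy(state, target):
    """Move one unit toward the target, or stay put when on it."""
    return (target > state) - (target < state)


def run(delay, target, steps, use_model):
    env = DelayedEnv(delay)
    state = env.state
    for _ in range(steps):
        # For brevity the agent reads the environment's queue; a real agent
        # would remember the actions it has sent itself.
        view = undo_delay(state, env.pending) if use_model else state
        state = env.step(policy(view, target))
    return state


if __name__ == "__main__":
    print("with model:", run(delay=3, target=5, steps=20, use_model=True))
    print("without model:", run(delay=3, target=5, steps=20, use_model=False))
```

In this toy setting the agent that acts on the predicted state settles on the target, while the agent that acts on the delayed observation overshoots and keeps oscillating around it. A learned model would only approximate the true dynamics, so its predictions would carry error that this sketch does not capture.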
