In the last decade, deep learning has achieved great success in machine learning tasks where the input data is represented at different levels of abstraction. Motivated by recent research in reinforcement learning using deep neural networks, we explore the feasibility of designing a learning model based on expert behaviour for complex, multidimensional tasks where a reward function is not available. We propose a novel method for apprenticeship learning that builds on previous research on supervised learning techniques in reinforcement learning. We apply our method to video frames from Atari games in order to teach an artificial agent to play those games. Although the reported results are not comparable with state-of-the-art results in reinforcement learning, we demonstrate that such an approach has the potential to achieve strong performance in the future and merits further research.