Learning effective policies for real-world problems remains an open challenge for the field of reinforcement learning (RL). The main limitations are the amount of data required and the pace at which that data can be obtained. In this paper, we study how to build lightweight simulators of complicated systems that run fast enough for deep RL to be applicable. We focus on domains where agents interact with a reduced portion of a larger environment while still being affected by the global dynamics. Our method combines local simulators with learned models that mimic the influence of the global system. The experiments reveal that incorporating this idea into the deep RL workflow can considerably accelerate training and opens several opportunities for future work.