GEM: Group Enhanced Model for Learning Dynamical Control Systems

Learning the dynamics of a physical system in which an autonomous agent operates is an important task. These systems often exhibit clear geometric structure. For instance, the trajectories of a robotic manipulator can be decomposed into translational and rotational motions, which are fully characterized by the corresponding Lie groups and Lie algebras. In this work, we take advantage of this structure to build effective dynamical models that are amenable to sample-based learning. We hypothesize that learning the dynamics on a Lie algebra vector space is more effective than learning a direct state transition model. To verify this hypothesis, we introduce the Group Enhanced Model (GEM). GEMs significantly outperform conventional transition models on long-term prediction, planning, and model-based reinforcement learning across a diverse suite of standard continuous-control environments, including Walker, Hopper, Reacher, Half-Cheetah, Inverted Pendulums, Ant, and Humanoid. Furthermore, plugging GEM into existing state-of-the-art systems enhances their performance, which we demonstrate on the PETS system. This work sheds light on the connection between dynamics learning and Lie group properties, opening new research directions and practical applications. Our code is publicly available at: https://tinyurl.com/GEMMBRL.
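
To make the Lie algebra idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes the state is a single orientation in SO(3) stored as a 3x3 rotation matrix, and all function names (`exp_so3`, `log_so3`, `algebra_target`, `apply_increment`) are hypothetical. It shows how a transition between two rotations might be expressed as a 3-vector in the Lie algebra so(3), which is the kind of quantity a model could regress instead of the next state itself.

```python
import math

def hat(w):
    """Map a 3-vector to its skew-symmetric matrix in so(3)."""
    x, y, z = w
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def matmul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def exp_so3(w):
    """Exponential map (Rodrigues' formula): so(3) vector -> rotation matrix."""
    theta = math.sqrt(sum(c * c for c in w))
    K = hat(w)
    K2 = matmul(K, K)
    if theta < 1e-8:
        # Small-angle limits of sin(t)/t and (1 - cos(t))/t^2.
        a, b = 1.0, 0.5
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / (theta * theta)
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]

def log_so3(R):
    """Logarithm map: rotation matrix -> so(3) vector.

    Only reliable for rotation angles well below pi; the angle-pi case
    needs special handling that this sketch omits.
    """
    cos_theta = max(-1.0, min(1.0, (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0))
    theta = math.acos(cos_theta)
    scale = 0.5 if theta < 1e-8 else theta / (2.0 * math.sin(theta))
    return [scale * (R[2][1] - R[1][2]),
            scale * (R[0][2] - R[2][0]),
            scale * (R[1][0] - R[0][1])]

def algebra_target(R, R_next):
    """Hypothetical regression target: the increment log(R^T R_next)."""
    return log_so3(matmul(transpose(R), R_next))

def apply_increment(R, w):
    """Roll the state forward with an increment: R_next = R exp(w)."""
    return matmul(R, exp_so3(w))
```

Under these assumptions, a learned model would output a vector `w` for each state-action pair, and `apply_increment` would map it back onto the group, so every predicted state remains a valid rotation. A direct state transition model that regresses the nine matrix entries offers no such guarantee.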

