Learning Temporally Extended Skills in Continuous Domains as Symbolic Actions for Planning

From arXiv. Accepted for publication at the 6th Conference on Robot Learning (CoRL) 2022, Auckland, New Zealand. The project website (including video) is available at https://seads.is.tue.mpg.de/

Problems that require both long-horizon planning and continuous control capabilities pose significant challenges to existing reinforcement learning agents. In this paper, we introduce a novel hierarchical reinforcement learning agent that links temporally extended skills for continuous control with a forward model in a symbolic discrete abstraction of the environment's state for planning. We term our agent SEADS, for Symbolic Effect-Aware Diverse Skills. We formulate an objective and a corresponding algorithm that lead to unsupervised learning of a diverse set of skills through intrinsic motivation, given a known state abstraction. The skills are learned jointly with the symbolic forward model, which captures the effect of skill execution in the state abstraction. After training, we can leverage the skills as symbolic actions, using the forward model for long-horizon planning, and subsequently execute the plan using the learned continuous-action control skills. The proposed algorithm learns skills and forward models that can be used to solve complex tasks requiring both continuous control and long-horizon planning capabilities with a high success rate. It compares favorably with other flat and hierarchical reinforcement learning baseline agents and is successfully demonstrated with a real robot.
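The planning step described in the abstract (treating skills as symbolic actions and searching with a forward model) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a deterministic forward model over a binary symbolic state and uses breadth-first search as the planner. The names `toy_forward_model` and `plan` are hypothetical, and the toy model (skill k toggles bit k and its neighbors) merely stands in for a learned model.

```python
from collections import deque
from typing import Callable, Dict, List, Optional, Tuple

State = Tuple[int, ...]  # symbolic state: one binary feature per entry


def toy_forward_model(state: State, skill: int) -> State:
    """Hypothetical stand-in for a learned symbolic forward model.

    Assumption for illustration only: skill k toggles bit k and its
    immediate neighbors. A learned model would predict this effect instead.
    """
    bits = list(state)
    for i in (skill - 1, skill, skill + 1):
        if 0 <= i < len(bits):
            bits[i] ^= 1
    return tuple(bits)


def plan(
    start: State,
    goal: State,
    num_skills: int,
    forward_model: Callable[[State, int], State],
) -> Optional[List[int]]:
    """Breadth-first search over symbolic states.

    Returns a shortest sequence of skill indices leading from start to goal,
    or None if the goal is unreachable under the given forward model.
    """
    if start == goal:
        return []
    # Maps each visited state to (previous state, skill that led here).
    parent: Dict[State, Optional[Tuple[State, int]]] = {start: None}
    frontier = deque([start])
    while frontier:
        state = frontier.popleft()
        for skill in range(num_skills):
            nxt = forward_model(state, skill)
            if nxt in parent:
                continue
            parent[nxt] = (state, skill)
            if nxt == goal:
                # Walk back through the parent links to recover the plan.
                skills: List[int] = []
                cur = nxt
                while parent[cur] is not None:
                    prev, used_skill = parent[cur]
                    skills.append(used_skill)
                    cur = prev
                return skills[::-1]
            frontier.append(nxt)
    return None


if __name__ == "__main__":
    start, goal = (1, 0, 0, 0), (0, 0, 0, 0)
    skill_sequence = plan(start, goal, num_skills=4, forward_model=toy_forward_model)
    print("planned skill sequence:", skill_sequence)
```

In a full agent, each index in the returned sequence would be handed to the corresponding continuous-control skill policy for execution. That part is omitted here because it depends on the environment and the trained skills.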

