Leveraging Approximate Symbolic Models for Reinforcement Learning via Skill Diversity

Creating reinforcement learning (RL) agents that can accept and leverage task-specific knowledge from humans has long been identified as a possible strategy for developing scalable approaches to long-horizon problems. While previous works have looked at the possibility of using symbolic models along with RL approaches, they tend to assume that the high-level action models are executable at the low level and that the fluents can exclusively characterize all desirable MDP states. This need not be true, and the assumption overlooks one of the central technical challenges of incorporating symbolic task knowledge: these symbolic models are going to be an incomplete representation of the underlying task. To this end, we introduce Symbolic-Model Guided Reinforcement Learning, wherein we formalize the relationship between the symbolic model and the underlying MDP in a way that allows us to capture the incompleteness of the symbolic model. We use these models to extract high-level landmarks that decompose the task, and at the low level we learn a set of diverse policies for each possible task sub-goal identified by the landmarks. We evaluate our system on three different benchmark domains and show that, even with incomplete symbolic model information, our approach is able to discover the task structure and efficiently guide the RL agent towards the goal.
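The abstract does not say how the landmarks are extracted, so the following is only a minimal sketch of one standard possibility: a delete-relaxation check, in which a fact counts as a landmark if the goal becomes unreachable once every action that adds the fact is removed. The `Action` type, the function names, and the toy key-and-door domain are all hypothetical and are not taken from the paper. The sketch also treats the symbolic model as given and does not address its incompleteness.

```python
from typing import FrozenSet, Iterable, List, NamedTuple, Set


class Action(NamedTuple):
    """A STRIPS-style action with delete effects ignored (hypothetical type)."""
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]


def relaxed_reachable(init: Iterable[str], actions: List[Action], goal: Set[str]) -> bool:
    """Return True if every goal fact is reachable when delete effects are ignored."""
    facts = set(init)
    changed = True
    while changed:
        changed = False
        for action in actions:
            # Apply any applicable action that contributes at least one new fact.
            if action.pre <= facts and not action.add <= facts:
                facts |= action.add
                changed = True
    return goal <= facts


def fact_landmarks(init: Set[str], actions: List[Action], goal: Set[str]) -> Set[str]:
    """Return non-initial facts that must hold at some point in every relaxed plan."""
    if not relaxed_reachable(init, actions, goal):
        raise ValueError("goal is not reachable even in the relaxed model")
    candidates = set(goal)
    for action in actions:
        candidates |= action.pre | action.add
    landmarks = set()
    for fact in candidates - init:
        if fact in goal:
            landmarks.add(fact)  # goal facts are landmarks by definition
            continue
        # Remove every achiever of the fact and test whether the goal survives.
        pruned = [a for a in actions if fact not in a.add]
        if not relaxed_reachable(init, pruned, goal):
            landmarks.add(fact)
    return landmarks


# Toy domain (an assumption for illustration): fetch a key, open a door, reach the goal.
toy_actions = [
    Action("pick_key", frozenset({"at_start"}), frozenset({"has_key"})),
    Action("open_door", frozenset({"has_key"}), frozenset({"door_open"})),
    Action("reach_goal", frozenset({"door_open"}), frozenset({"at_goal"})),
    Action("detour", frozenset({"at_start"}), frozenset({"on_ledge"})),
]
landmarks = fact_landmarks({"at_start"}, toy_actions, {"at_goal"})
print(sorted(landmarks))
```

In this toy domain, each landmark found this way would be a candidate sub-goal for which a set of diverse low-level policies could be learned, while `on_ledge` is not a landmark because the goal is still reachable without it.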

