Learning and Planning with a Semantic Model

Building deep reinforcement learning agents that can generalize and adapt to unseen environments remains a fundamental challenge for AI. This paper describes progress on this challenge in the context of man-made environments, which are visually diverse but contain intrinsic semantic regularities. We propose a hybrid model-based and model-free approach, LEArning and Planning with Semantics (LEAPS), consisting of a multi-target sub-policy that acts on visual inputs and a Bayesian model over semantic structures. When placed in an unseen environment, the agent plans with the semantic model to make high-level decisions, proposes the next sub-target for the sub-policy to execute, and updates the semantic model based on new observations. We perform experiments on visual navigation tasks using House3D, a 3D environment that contains diverse human-designed indoor scenes with real-world objects. LEAPS outperforms strong baselines that do not explicitly plan using the semantic content.
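The plan, propose, update loop described in the abstract can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the semantic structure is modeled as undirected connectivity between room types, each connection carries a Beta-Bernoulli belief, and the planner brute-forces all simple paths. The names `SemanticModel`, `prob`, `update`, and `plan`, as well as the room names and prior counts, are hypothetical.

```python
from itertools import permutations


class SemanticModel:
    """Hypothetical Beta-Bernoulli belief over which room types connect."""

    def __init__(self, rooms, prior_counts, default=(1.0, 1.0)):
        self.rooms = list(rooms)
        self.default = default
        # Undirected edges: key is a frozenset, value is [alpha, beta].
        self.counts = {frozenset(k): list(v) for k, v in prior_counts.items()}

    def prob(self, a, b):
        """Posterior mean probability that room a connects to room b."""
        alpha, beta = self.counts.get(frozenset((a, b)), self.default)
        return alpha / (alpha + beta)

    def update(self, a, b, reached):
        """Bayesian update after the sub-policy tried to go from a to b."""
        counts = self.counts.setdefault(frozenset((a, b)), list(self.default))
        counts[0 if reached else 1] += 1.0

    def plan(self, start, target):
        """Return the simple path with the highest product of edge beliefs."""
        others = [r for r in self.rooms if r not in (start, target)]
        best_path, best_p = None, -1.0
        for k in range(len(others) + 1):
            for mid in permutations(others, k):
                path = (start, *mid, target)
                p = 1.0
                for a, b in zip(path, path[1:]):
                    p *= self.prob(a, b)
                if p > best_p:
                    best_path, best_p = path, p
        return best_path, best_p


# Illustrative prior pseudo-counts (assumed, not learned from House3D).
prior = {
    ("outdoor", "living"): (8.0, 2.0),
    ("living", "kitchen"): (7.0, 3.0),
    ("outdoor", "kitchen"): (1.0, 9.0),
}
model = SemanticModel(["outdoor", "living", "kitchen"], prior)
path, p = model.plan("outdoor", "kitchen")
next_subtarget = path[1]  # handed to the sub-policy as its next goal
```

In this sketch the agent would pass `next_subtarget` to its sub-policy, call `update` with the observed outcome, and replan. The brute-force path search is only practical for a handful of room types; it is used here to keep the example short.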

