Curious Exploration via Structured World Models Yields Zero-Shot Object Manipulation

It has been a long-standing dream to design artificial agents that explore their environment efficiently via intrinsic motivation, similar to how children perform curious free play. Despite recent advances in intrinsically motivated reinforcement learning (RL), sample-efficient exploration in object manipulation scenarios remains a significant challenge, as most of the relevant information lies in the sparse agent-object and object-object interactions. In this paper, we propose to use structured world models to incorporate relational inductive biases in the control loop to achieve sample-efficient and interaction-rich exploration in compositional multi-object environments. By planning for future novelty inside structured world models, our method generates free-play behavior that starts to interact with objects early on and develops more complex behavior over time. Instead of using models only to compute intrinsic rewards, as is commonly done, our method shows that the self-reinforcing cycle between good models and good exploration also opens up another avenue: zero-shot generalization to downstream tasks via model-based planning. After the entirely intrinsic, task-agnostic exploration phase, our method solves challenging downstream tasks such as stacking, flipping, pick & place, and throwing, and it generalizes to unseen numbers and arrangements of objects without any additional training.
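
As a rough illustration of what "planning for future novelty" inside a model could look like, the sketch below scores candidate action sequences by how much an ensemble of models disagrees along the imagined rollout, and keeps the sequence with the highest disagreement. Everything here is an assumption made for illustration: the abstract does not specify the novelty measure, the planner, or the model class, so ensemble variance, random-shooting search, and scalar linear models are stand-ins, and the names `make_ensemble`, `novelty_of_plan`, and `plan_for_novelty` are hypothetical. The toy models are also unstructured, so the relational inductive bias that the paper emphasizes is not captured.

```python
import random
from statistics import pvariance


def make_ensemble(n_models, rng):
    """Hypothetical stand-in for a learned ensemble.

    Each member is a scalar linear model s' = a * s + b * u with slightly
    different parameters, mimicking models that disagree where data is scarce.
    """
    return [
        (1.0 + 0.1 * rng.gauss(0, 1), 0.5 + 0.1 * rng.gauss(0, 1))
        for _ in range(n_models)
    ]


def novelty_of_plan(ensemble, state, actions):
    """Sum of ensemble disagreement (variance of predictions) along a rollout."""
    states = [state] * len(ensemble)  # each member rolls out its own trajectory
    total = 0.0
    for u in actions:
        states = [a * s + b * u for (a, b), s in zip(ensemble, states)]
        total += pvariance(states)
    return total


def plan_for_novelty(ensemble, state, horizon, n_candidates, rng):
    """Random-shooting planner (an assumed choice): sample action sequences
    uniformly in [-1, 1] and return the one with the highest imagined novelty."""
    candidates = [
        [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        for _ in range(n_candidates)
    ]
    scores = [novelty_of_plan(ensemble, state, seq) for seq in candidates]
    best = max(range(n_candidates), key=scores.__getitem__)
    return candidates[best], scores


if __name__ == "__main__":
    rng = random.Random(0)
    ensemble = make_ensemble(n_models=5, rng=rng)
    plan, scores = plan_for_novelty(ensemble, state=0.0, horizon=3,
                                    n_candidates=16, rng=rng)
    print("chosen action sequence:", plan)
    print("imagined novelty:", max(scores))
```

In a downstream task, the same planning loop could in principle be reused with a task reward in place of the disagreement score, which is the sense in which exploration and zero-shot planning share one model.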

