Discovering and Achieving Goals via World Models

How can artificial agents learn to solve many diverse tasks in complex visual environments in the absence of any supervision? We decompose this question into two problems: discovering new goals and learning to reliably achieve them. We introduce Latent Explorer Achiever (LEXA), a unified solution to these that learns a world model from image inputs and uses it to train an explorer and an achiever policy from imagined rollouts. Unlike prior methods that explore by reaching previously visited states, the explorer plans to discover unseen surprising states through foresight, which are then used as diverse targets for the achiever to practice. After the unsupervised phase, LEXA solves tasks specified as goal images zero-shot without any additional learning. LEXA substantially outperforms previous approaches to unsupervised goal-reaching, both on prior benchmarks and on a new challenging benchmark with a total of 40 test tasks spanning across four standard robotic manipulation and locomotion domains. LEXA further achieves goals that require interacting with multiple objects in sequence. Finally, to demonstrate the scalability and generality of LEXA, we train a single general agent across four distinct environments. Code and videos at https://orybkin.github.io/lexa/

