World models learn the consequences of actions in vision-based interactive systems. However, in practical scenarios such as autonomous driving, there commonly exist noncontrollable dynamics independent of the action signals, making it difficult to learn effective world models. To tackle this problem, we present a novel reinforcement learning approach named Iso-Dream, which improves the Dream-to-Control framework in two aspects. First, by optimizing the inverse dynamics, we encourage the world model to learn controllable and noncontrollable sources of spatiotemporal changes on isolated state transition branches. Second, we optimize the behavior of the agent on the decoupled latent imaginations of the world model. Specifically, to estimate state values, we roll out the noncontrollable states into the future and associate them with the current controllable state. In this way, the isolation of dynamics sources can greatly benefit long-horizon decision-making of the agent, such as a self-driving car that can avoid potential risks by anticipating the movement of other vehicles. Experiments show that Iso-Dream is effective in decoupling the mixed dynamics and remarkably outperforms existing approaches in a wide range of visual control and prediction domains.
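The core mechanism can be illustrated with a toy sketch. This is not the paper's implementation (which operates on learned latent states of a recurrent world model); it is a minimal, assumed linear-dynamics analogue showing the two isolated transition branches and how rolling out the noncontrollable branch informs value estimation. All transition matrices, the `value_estimate` heuristic, and the driving interpretation are hypothetical stand-ins.

```python
import numpy as np

# Hypothetical linear dynamics standing in for the two learned branches.
A_ctrl = np.array([[0.9, 0.0], [0.0, 0.9]])   # controllable transition
B_act  = np.array([[1.0], [0.5]])             # effect of the action signal
A_free = np.array([[0.0, -1.0], [1.0, 0.0]])  # noncontrollable transition (a rotation)

def step_controllable(s, a):
    """Controllable branch: state transition depends on the action."""
    return A_ctrl @ s + B_act @ a

def step_noncontrollable(z):
    """Noncontrollable branch: evolves independently of any action."""
    return A_free @ z

def rollout_noncontrollable(z, horizon):
    """Roll the action-free branch forward to anticipate future context."""
    future = []
    for _ in range(horizon):
        z = step_noncontrollable(z)
        future.append(z)
    return future

def value_estimate(s, z, horizon=5):
    """Toy value: the margin between the agent's controllable state and
    the anticipated future noncontrollable states (e.g. other vehicles).
    Higher is safer; the real method learns this association end to end."""
    future = rollout_noncontrollable(z, horizon)
    return min(np.linalg.norm(s - zf) for zf in future)

s = np.array([1.0, 0.0])   # current controllable state (the agent)
z = np.array([0.0, 1.0])   # current noncontrollable state (an obstacle)
a_stay = np.array([0.0])
a_move = np.array([1.0])

# The agent prefers the action whose resulting state stays clear of the
# anticipated noncontrollable trajectory, not just its current position.
v_stay = value_estimate(step_controllable(s, a_stay), z)
v_move = value_estimate(step_controllable(s, a_move), z)
```

Here `a_stay` leaves the agent in the path the obstacle will sweep through, so `v_move > v_stay`: anticipating the noncontrollable rollout changes the preferred action even though the obstacle is currently far away.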