Latent world models allow agents to reason about complex environments with high-dimensional observations. However, adapting to new environments and effectively leveraging previous knowledge remain significant challenges. We present variational causal dynamics (VCD), a structured world model that exploits the invariance of causal mechanisms across environments to achieve fast and modular adaptation. By causally factorising a transition model, VCD is able to identify reusable components across different environments. This is achieved by combining causal discovery and variational inference to learn a latent representation and transition model jointly in an unsupervised manner. Specifically, we optimise the evidence lower bound jointly over a representation model and a transition model structured as a causal graphical model. In evaluations on simulated environments with state and image observations, we show that VCD is able to successfully identify causal variables, and to discover consistent causal structures across different environments. Moreover, given a small number of observations in a previously unseen, intervened environment, VCD is able to identify the sparse changes in the dynamics and to adapt efficiently. In doing so, VCD significantly extends the capabilities of the current state-of-the-art in latent world models while also comparing favourably in terms of prediction accuracy.
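As a rough illustration of the objective described above, the following is a generic sketch of an evidence lower bound for a sequential latent-variable model whose transition prior factorises over a causal graph. The notation is an assumption made here for illustration and is not taken from the source: observations $o_t$, actions $a_t$, latent state $z_t = (z_t^1, \dots, z_t^d)$, a directed graph $G$ over the latents, and parameters $\theta$, $\phi$. The exact objective used by VCD may differ.

```latex
% Assumed notation (not from the source): o_t observations, a_t actions,
% z_t = (z_t^1, ..., z_t^d) latent state, G a directed graph over latents.
% At t = 1 the transition prior is taken to be a fixed initial prior p(z_1).
\begin{align}
\log p(o_{1:T} \mid a_{1:T})
  &\ge \sum_{t=1}^{T} \mathbb{E}_{q_\phi}\Big[ \log p_\theta(o_t \mid z_t)
     - \mathrm{KL}\big( q_\phi(z_t \mid z_{t-1}, a_{t-1}, o_t)
       \,\big\|\, p_\theta(z_t \mid z_{t-1}, a_{t-1}) \big) \Big], \\
p_\theta(z_t \mid z_{t-1}, a_{t-1})
  &= \prod_{i=1}^{d} p_{\theta_i}\big( z_t^i \mid \mathrm{Pa}_G(z_t^i) \big),
  \qquad \mathrm{Pa}_G(z_t^i) \subseteq \{ z_{t-1}^1, \dots, z_{t-1}^d, a_{t-1} \}.
\end{align}
```

Under a factorisation of this kind, each latent dimension has its own conditional $p_{\theta_i}$, so an intervention that alters a single mechanism would in principle require re-estimating only the corresponding factor while the remaining factors are reused unchanged.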