Learning world models from sensory inputs enables agents to plan actions by imagining their future outcomes. World models have previously been shown to improve sample efficiency in simulated environments with few objects, but have not yet been applied successfully to environments with many objects. In such environments, often only a small number of objects are moving or interacting at any given time. In this paper, we investigate integrating this inductive bias of sparse interactions into the latent dynamics of world models trained from pixels. First, we introduce Variational Sparse Gating (VSG), a latent dynamics model that updates its feature dimensions sparsely through stochastic binary gates. Moreover, we propose a simplified architecture, Simple Variational Sparse Gating (SVSG), that removes the deterministic pathway of previous models, resulting in a fully stochastic transition function that leverages the VSG mechanism. We evaluate the two model architectures in the BringBackShapes (BBS) environment, which features a large number of moving objects and partial observability, and find clear improvements over prior models.
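The sparse update described for VSG can be illustrated with a small sketch. The abstract gives no equations, so everything below is an assumption made for illustration: the function name `sparse_gated_update`, the rule that a gate value of 1 overwrites a dimension with its candidate value while 0 keeps the previous value, and the fixed gate probabilities (a learned model would presumably predict them). It is not the authors' implementation.

```python
import random


def sparse_gated_update(prev_state, candidate, gate_probs, rng):
    """Hypothetical sparse update: sample one binary gate per dimension.

    A dimension takes its candidate value when its gate is 1 and keeps
    its previous value when its gate is 0.
    """
    # Each gate is a Bernoulli sample with its own open-probability.
    gates = [1 if rng.random() < p else 0 for p in gate_probs]
    new_state = [c if g else h for h, c, g in zip(prev_state, candidate, gates)]
    return new_state, gates


rng = random.Random(0)            # seeded only to make the run repeatable
prev_state = [0.0] * 8            # previous latent features (assumed size 8)
candidate = [1.0] * 8             # proposed new features
gate_probs = [0.25] * 8           # low open-probability gives sparse updates
new_state, gates = sparse_gated_update(prev_state, candidate, gate_probs, rng)
```

With low gate probabilities, most dimensions carry over unchanged from one step to the next, which mirrors the intuition that only a few objects move or interact at a time.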