World models for environments with many objects face a combinatorial explosion of states: as the number of objects increases, the number of possible arrangements grows exponentially. In this paper, we learn to generalize over robotic pick-and-place tasks using object-factored world models, which combat the combinatorial explosion by ensuring that predictions are equivariant to permutations of objects. Previous object-factored models were limited either by their inability to model actions or by their inability to plan for complex manipulation tasks. We build on recent contrastive methods for training object-factored world models, which we extend to model continuous robot actions and to accurately predict the physics of robotic pick-and-place. To do so, we use a residual stack of graph neural networks whose node and edge networks both receive action information at multiple levels. Crucially, our learned model can make predictions about tasks not represented in the training data: we demonstrate successful zero-shot generalization to novel tasks, with only a minor decrease in model performance. Moreover, we show that an ensemble of our models can be used with heuristic search to plan for tasks involving up to 12 pick-and-place actions. We also demonstrate transfer to a physical robot.
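To make the permutation-equivariance property concrete, the following is a minimal NumPy sketch of a residual stack of message-passing layers in which the action vector is fed to both the edge and the node networks. It is an illustration under stated assumptions, not the implementation used in this paper: the helper names (`init_mlp`, `mlp`, `gnn_layer`, `predict_next`), the layer sizes, the fully connected object graph, the sum aggregation, and the use of a single action vector shared by all objects are all hypothetical choices made for the example.

```python
import numpy as np

def init_mlp(rng, d_in, d_hidden, d_out):
    """Random weights for a two-layer perceptron (sizes are illustrative)."""
    return (rng.normal(scale=0.1, size=(d_in, d_hidden)), np.zeros(d_hidden),
            rng.normal(scale=0.1, size=(d_hidden, d_out)), np.zeros(d_out))

def mlp(params, x):
    """Two-layer perceptron with a ReLU nonlinearity."""
    W1, b1, W2, b2 = params
    return np.maximum(x @ W1 + b1, 0.0) @ W2 + b2

def gnn_layer(edge_params, node_params, z, a):
    """One residual message-passing step over a fully connected object graph.

    z: (K, D) per-object latent states.
    a: (A,) continuous action, shared by all objects (an assumption).
    """
    K, D = z.shape
    A = a.shape[0]
    # Edge network input: sender state, receiver state, and the action.
    send = np.repeat(z[:, None, :], K, axis=1)   # send[i, j] = z[i]
    recv = np.repeat(z[None, :, :], K, axis=0)   # recv[i, j] = z[j]
    a_edges = np.broadcast_to(a, (K, K, A))
    msgs = mlp(edge_params, np.concatenate([send, recv, a_edges], axis=-1))
    # Drop self-edges, then sum incoming messages for each receiver j.
    mask = 1.0 - np.eye(K)
    agg = (msgs * mask[:, :, None]).sum(axis=0)  # (K, D)
    # Node network input: own state, aggregated messages, and the action.
    a_nodes = np.broadcast_to(a, (K, A))
    delta = mlp(node_params, np.concatenate([z, agg, a_nodes], axis=-1))
    return z + delta                             # residual connection

def predict_next(layers, z, a):
    """Apply the residual stack; the action is re-injected at every layer."""
    for edge_params, node_params in layers:
        z = gnn_layer(edge_params, node_params, z, a)
    return z

rng = np.random.default_rng(0)
K, D, A, H = 5, 4, 3, 16                         # objects, latent, action, hidden
layers = [(init_mlp(rng, 2 * D + A, H, D), init_mlp(rng, 2 * D + A, H, D))
          for _ in range(2)]
z = rng.normal(size=(K, D))
a = rng.normal(size=A)
perm = rng.permutation(K)

# Reordering the objects before prediction reorders the prediction identically.
out_then_perm = predict_next(layers, z, a)[perm]
perm_then_out = predict_next(layers, z[perm], a)
print(np.allclose(out_then_perm, perm_then_out))
```

Because the edge and node networks are shared across objects and the aggregation is a sum, the model has no dependence on object ordering. This is the property that lets a single set of weights cover every arrangement of the same objects instead of treating each arrangement as a distinct state.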