Compositional generalization is a critical ability in learning and decision-making. We study compositional generalization in world modeling, focusing on reinforcement learning in object-oriented environments. We (1) formalize the compositional generalization problem with an algebraic approach and (2) study how a world model can achieve it. We introduce a conceptual environment, Object Library, together with two instances of it, and use a principled pipeline to measure generalization ability. Motivated by this formulation, we use our framework to analyze several methods that have either exact or no compositional generalization ability, and we design a differentiable approach, Homomorphic Object-oriented World Model (HOWM), that achieves soft but more efficient compositional generalization.