Compositional generalization is a critical ability in learning and decision-making. We focus on the setting of reinforcement learning in object-oriented environments to study compositional generalization in world modeling. We (1) formalize the compositional generalization problem with an algebraic approach and (2) study how a world model can achieve it. We introduce a conceptual environment, Object Library, and two instances of it, and deploy a principled pipeline to measure generalization ability. Motivated by the formulation, we use our framework to analyze several methods with exact or no compositional generalization ability, and design a differentiable approach, Homomorphic Object-oriented World Model (HOWM), that achieves approximate but more efficient compositional generalization.