Humans are highly efficient learners, with the ability to grasp the meaning of a new concept from just a few examples. Unlike popular computer vision systems, humans can flexibly leverage the compositional structure of the visual world, understanding new concepts as combinations of existing concepts. In the current paper, we study how people learn different types of visual compositions, using abstract visual forms with rich relational structure. We find that people can make meaningful compositional generalizations from just a few examples in a variety of scenarios, and we develop a Bayesian program induction model that provides a close fit to the behavioral data. Unlike past work examining special cases of compositionality, our work shows how a single computational approach can account for many distinct types of compositional generalization.