Humans have the remarkable ability to recognize and acquire novel visual concepts in a zero-shot manner. Given a high-level, symbolic description of a novel concept in terms of previously learned visual concepts and their relations, humans can recognize novel concepts without seeing any examples. Moreover, they can acquire new concepts by parsing and communicating symbolic structures using learned visual concepts and relations. Endowing machines with these capabilities is pivotal to improving their generalization at inference time. In this work, we introduce Zero-shot Concept Recognition and Acquisition (ZeroC), a neuro-symbolic architecture that can recognize and acquire novel concepts in a zero-shot way. ZeroC represents concepts as graphs of constituent concept models (as nodes) and their relations (as edges). To allow inference-time composition, we employ energy-based models (EBMs) to model concepts and relations. We design the ZeroC architecture so that it admits a one-to-one mapping between the symbolic graph structure of a concept and its corresponding EBM, which, for the first time, allows acquiring new concepts, communicating their graph structures, and applying them to classification and detection tasks (even across domains) at inference time. We introduce algorithms for learning and inference with ZeroC. We evaluate ZeroC on a challenging grid-world dataset designed to probe zero-shot concept recognition and acquisition, and demonstrate its capability.
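To make the graph-to-EBM correspondence concrete, the following is a minimal sketch of how a concept graph could be composed into a single energy function at inference time. It is an illustration under stated assumptions, not the ZeroC implementation: the names `ConceptGraph`, `line_energy`, `parallel_energy` and `perpendicular_energy` are hypothetical, the energies are hand-written rules standing in for learned neural EBMs, and summing node and edge energies is one common way to compose EBMs that the abstract itself does not specify.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Assumption: a "mask" is a set of (row, col) grid cells selected in an image.
Mask = frozenset
NodeEnergy = Callable[[Mask], float]        # low energy = mask fits the concept
EdgeEnergy = Callable[[Mask, Mask], float]  # low energy = masks satisfy the relation


def orientation(mask: Mask) -> Optional[str]:
    """Return 'h' or 'v' if the mask is a straight line of >= 2 cells, else None."""
    rows = {r for r, _ in mask}
    cols = {c for _, c in mask}
    if len(mask) >= 2 and len(rows) == 1:
        return "h"
    if len(mask) >= 2 and len(cols) == 1:
        return "v"
    return None


def line_energy(mask: Mask) -> float:
    """Toy stand-in for a learned concept EBM: 0 for a line, 1 otherwise."""
    return 0.0 if orientation(mask) else 1.0


def parallel_energy(a: Mask, b: Mask) -> float:
    """Toy stand-in for a learned relation EBM: 0 if both lines share an orientation."""
    oa, ob = orientation(a), orientation(b)
    return 0.0 if oa and ob and oa == ob else 1.0


def perpendicular_energy(a: Mask, b: Mask) -> float:
    """Toy stand-in for a learned relation EBM: 0 if the lines differ in orientation."""
    oa, ob = orientation(a), orientation(b)
    return 0.0 if oa and ob and oa != ob else 1.0


@dataclass
class ConceptGraph:
    """Symbolic concept: named nodes (concept energies) and edges (relation energies)."""
    nodes: Dict[str, NodeEnergy]
    edges: List[Tuple[str, str, EdgeEnergy]] = field(default_factory=list)

    def energy(self, assignment: Dict[str, Mask]) -> float:
        # Assumed composition rule: total energy is the sum over nodes and edges.
        total = sum(e(assignment[name]) for name, e in self.nodes.items())
        total += sum(e(assignment[a], assignment[b]) for a, b, e in self.edges)
        return total


# A new concept described purely symbolically: two lines that are perpendicular.
perpendicular_pair = ConceptGraph(
    nodes={"a": line_energy, "b": line_energy},
    edges=[("a", "b", perpendicular_energy)],
)

horizontal = frozenset({(0, 0), (0, 1), (0, 2)})
vertical = frozenset({(0, 0), (1, 0), (2, 0)})

matching = perpendicular_pair.energy({"a": horizontal, "b": vertical})
mismatch = perpendicular_pair.energy({"a": horizontal, "b": horizontal})
```

In this toy setting the new concept is never trained on: it is assembled from previously defined parts, and a candidate assignment is accepted when its composed energy is low. Here `matching` is lower than `mismatch` because only the first assignment satisfies the perpendicular relation.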