We propose Recognition as Part Composition (RPC), an image encoding approach inspired by human cognition. It builds on the cognitive theory that humans recognize complex objects by their components and build a small, compact vocabulary of concepts with which to represent each instance. RPC encodes an image by first decomposing it into salient parts and then encoding each part as a mixture of a small number of prototypes, each representing a certain concept. We find that this type of learning can overcome hurdles faced by deep convolutional networks in low-shot generalization tasks such as zero-shot learning, few-shot learning, and unsupervised domain adaptation. Furthermore, we find that a classifier using an RPC image encoder is fairly robust to adversarial attacks, to which deep neural networks are known to be vulnerable. Because our image encoding principle is based on human cognition, one would expect the encodings to be interpretable by humans, which we find to be the case in crowd-sourcing experiments. Finally, we propose an application of these interpretable encodings: generating synthetic attribute annotations for evaluating zero-shot learning methods on new datasets.
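To make the second encoding step concrete, the following is a minimal sketch of representing each part as a mixture of prototypes. It is illustrative only and not the RPC implementation: the abstract does not specify how parts are extracted or how mixture weights are computed, so the sketch assumes that part features are already given as vectors and that the weights come from a softmax over cosine similarities. The names `encode_parts`, `part_features`, `prototypes`, and `temperature` are hypothetical.

```python
import numpy as np


def encode_parts(part_features, prototypes, temperature=0.1):
    """Encode each part as a convex mixture of prototypes.

    Assumption (not from the paper): mixture weights are a softmax over
    cosine similarities between part features and prototypes.
    """
    # L2-normalize rows so that the dot product equals cosine similarity.
    parts_unit = part_features / np.linalg.norm(part_features, axis=1, keepdims=True)
    protos_unit = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)

    # Similarity matrix of shape (n_parts, n_prototypes).
    sims = parts_unit @ protos_unit.T

    # Numerically stable softmax over the prototype axis.
    logits = sims / temperature
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)

    # Each encoded part is a weighted combination of the prototypes.
    encoded = weights @ prototypes
    return weights, encoded


# Toy data: 4 parts and 3 prototypes in an 8-dimensional feature space.
rng = np.random.default_rng(0)
part_features = rng.normal(size=(4, 8))
prototypes = rng.normal(size=(3, 8))
weights, encoded = encode_parts(part_features, prototypes)
```

In this sketch, a lower `temperature` concentrates each part's weight on fewer prototypes, which is one simple way to approximate a mixture over "a small number" of them. The rows of `weights` are the kind of per-part concept assignment that a human annotator could in principle be asked to inspect.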