Existing deep learning models suffer a performance drop on out-of-distribution (o.o.d.) data in computer vision tasks. In comparison, humans have a remarkable ability to interpret images, even when the scenes they depict are rare, thanks to the systematicity of acquired knowledge. This work focuses on 1) the acquisition of systematic knowledge of 2D transformations, and 2) architectural components that can leverage the learned knowledge in image classification tasks in an o.o.d. setting. With a new training methodology based on synthetic datasets constructed under the causal framework, deep neural networks acquire knowledge from semantically different domains (e.g., even from noise) and exhibit a certain level of systematicity in parameter estimation experiments. Based on this, a novel architecture is devised, consisting of a classifier, an estimator and an identifier (abbreviated as "CED"). By emulating the "hypothesis-verification" process in human visual perception, CED significantly improves classification accuracy on test sets under covariate shift.
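To make the two ideas above concrete, the following is a minimal toy sketch, not the CED architecture itself. It assumes 2D point patterns in place of images, a single rotation angle as the only transformation parameter, known point correspondences, and a closed-form least-squares fit standing in for the learned estimator. All names (`make_pair`, `estimate_rotation`, `classify_by_verification`) are hypothetical and chosen for illustration only. The first part builds synthetic training pairs from pure noise whose label is the transformation parameter; the second part emulates "hypothesis-verification" by trying each class hypothesis, estimating the transformation, undoing it, and keeping the hypothesis with the smallest residual.

```python
import math
import random


def rotate(points, theta):
    """Rotate a list of (x, y) points counter-clockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


def make_pair(rng, n_points=16):
    """Synthetic sample built from noise: (pattern, rotated pattern, angle).

    The content is random, so only the transformation carries the label.
    """
    base = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n_points)]
    theta = rng.uniform(-math.pi / 2, math.pi / 2)
    return base, rotate(base, theta), theta


def estimate_rotation(src, dst):
    """Least-squares rotation angle mapping src onto dst.

    Assumption: this closed form stands in for a learned estimator network.
    """
    num = sum(x * v - y * u for (x, y), (u, v) in zip(src, dst))
    den = sum(x * u + y * v for (x, y), (u, v) in zip(src, dst))
    return math.atan2(num, den)


def residual(a, b):
    """Mean squared distance between corresponding points."""
    return sum((x - u) ** 2 + (y - v) ** 2 for (x, y), (u, v) in zip(a, b)) / len(a)


def classify_by_verification(observed, prototypes):
    """Return (label, residual, angle) of the best-verified class hypothesis."""
    best = None
    for label, proto in prototypes.items():
        theta = estimate_rotation(proto, observed)   # estimate under this hypothesis
        restored = rotate(observed, -theta)          # undo the estimated transformation
        err = residual(restored, proto)              # verify against the prototype
        if best is None or err < best[1]:
            best = (label, err, theta)
    return best


if __name__ == "__main__":
    rng = random.Random(0)
    proto_a, _, _ = make_pair(rng)
    proto_b, _, _ = make_pair(rng)
    observed = rotate(proto_b, 0.7)  # class "b" seen under an unseen rotation
    print(classify_by_verification(observed, {"a": proto_a, "b": proto_b}))
```

In this toy setting the estimator never sees the class prototypes during "training" (it is parameter-free), which loosely mirrors the claim that knowledge of 2D transformations can be acquired independently of image semantics; a learned estimator and an image-based identifier would replace the closed-form fit and the residual check.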