Cartoon face recognition is challenging because cartoon faces typically have smooth color regions and emphasized edges, so the key to recognizing them is to precisely perceive their sparse and critical shape patterns. However, it is difficult to learn a shape-oriented representation for cartoon face recognition with convolutional neural networks (CNNs). To mitigate this issue, we propose GraphJigsaw, which constructs jigsaw puzzles at various stages of the classification network and solves them with a graph convolutional network (GCN) in a progressive manner. Solving the puzzles requires the model to spot the shape patterns of the cartoon faces, as texture information is quite limited. The key idea of GraphJigsaw is to construct a jigsaw puzzle by randomly shuffling the intermediate convolutional feature maps in the spatial dimension and to exploit the GCN to reason about and recover the correct layout of the jigsaw fragments in a self-supervised manner. GraphJigsaw thus avoids training the classification model on deconstructed images, which would introduce noisy patterns that are harmful to the final classification. Specifically, GraphJigsaw can be incorporated at various stages of the classification model in a top-down manner, which facilitates propagating the learned shape patterns gradually. GraphJigsaw does not rely on any extra manual annotation during training and adds no extra computational burden at inference time. Both quantitative and qualitative experimental results verify the feasibility of the proposed GraphJigsaw, which consistently outperforms other face recognition and jigsaw-based methods on two popular cartoon face datasets with considerable improvements.
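As a rough illustration of the puzzle-construction step described above, the following minimal sketch shuffles the spatial patches of a feature map and records the permutation that a solver would have to recover. It is not the authors' implementation: the function names `shuffle_feature_map` and `restore_layout`, the `grid` parameter, and the use of plain nested lists instead of framework tensors are all assumptions made for illustration, and the GCN that predicts the layout is not shown.

```python
import random


def _copy_patch(dst, src, dst_idx, src_idx, grid, ph, pw):
    """Copy one ph x pw patch (all channels) from src slot src_idx to dst slot dst_idx."""
    dst_r, dst_c = divmod(dst_idx, grid)
    src_r, src_c = divmod(src_idx, grid)
    for c in range(len(src)):
        for y in range(ph):
            for x in range(pw):
                dst[c][dst_r * ph + y][dst_c * pw + x] = src[c][src_r * ph + y][src_c * pw + x]


def _patch_size(fmap, grid):
    """Return (patch_height, patch_width) for a C x H x W nested-list feature map."""
    height, width = len(fmap[0]), len(fmap[0][0])
    if height % grid or width % grid:
        raise ValueError("feature map size must be divisible by the grid size")
    return height // grid, width // grid


def _zeros_like(fmap):
    return [[[0.0] * len(row) for row in channel] for channel in fmap]


def shuffle_feature_map(fmap, grid, rng):
    """Split the map into grid x grid patches and shuffle them spatially.

    Returns (shuffled, perm), where perm[i] is the original index of the
    patch that now sits in slot i. perm is the self-supervised target.
    """
    ph, pw = _patch_size(fmap, grid)
    perm = list(range(grid * grid))
    rng.shuffle(perm)
    shuffled = _zeros_like(fmap)
    for slot, original in enumerate(perm):
        _copy_patch(shuffled, fmap, slot, original, grid, ph, pw)
    return shuffled, perm


def restore_layout(shuffled, perm, grid):
    """Undo the shuffle given a (true or predicted) permutation."""
    ph, pw = _patch_size(shuffled, grid)
    restored = _zeros_like(shuffled)
    for slot, original in enumerate(perm):
        _copy_patch(restored, shuffled, original, slot, grid, ph, pw)
    return restored
```

In this sketch the permutation `perm` plays the role of the free training signal: because it is generated by the shuffle itself, no manual annotation is needed, and the shuffling branch can simply be skipped at inference time.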