Given two images of different anime roles (characters), anime style recognition (ASR) aims to learn the abstract painting style and determine whether the two images come from the same work, which is an interesting but challenging problem. Unlike biometric recognition tasks such as face recognition, iris recognition, and person re-identification, ASR suffers from a much larger semantic gap yet has received less attention. In this paper, we propose a challenging ASR benchmark. First, we collect a large-scale ASR dataset (LSASRD), which contains 20,937 images from 190 anime works, where each work has at least ten different roles. In addition to its large scale, LSASRD includes a range of challenging factors, such as complex illumination, various poses, theatrical colors, and exaggerated compositions. Second, we design a cross-role protocol to evaluate ASR performance, in which query and gallery images must come from different roles, so that the evaluation verifies that an ASR model learns abstract painting style rather than discriminative features of individual roles. Finally, we apply two powerful person re-identification methods, namely AGW and TransReID, to establish baseline performance on LSASRD. Surprisingly, the recent transformer model (i.e., TransReID) achieves only 42.24% mAP on LSASRD. We therefore believe that the ASR task, with its huge semantic gap, deserves deep and long-term research. We will release our dataset and code at https://github.com/nkjcqvcpi/ASR.
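As a rough illustration of the cross-role protocol, the sketch below computes mAP while ignoring every gallery image that shares the query's role. It is not the benchmark's released evaluation code. The function names, the `(feature, work_id, role_id)` tuple layout, the use of cosine similarity, and the assumption that role identifiers are unique across works are all hypothetical choices made for this example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def cross_role_map(queries, gallery):
    """Mean average precision under an assumed cross-role filter.

    queries, gallery: lists of (feature, work_id, role_id) tuples.
    Role ids are assumed to be unique across works (illustrative assumption).
    """
    aps = []
    for q_feat, q_work, q_role in queries:
        # Cross-role filter: drop gallery images showing the query's own role.
        candidates = [(cosine(q_feat, f), w) for f, w, r in gallery if r != q_role]
        # Rank the remaining gallery images by descending similarity.
        candidates.sort(key=lambda c: c[0], reverse=True)
        hits = 0
        precisions = []
        for rank, (_, work) in enumerate(candidates, start=1):
            if work == q_work:  # correct match: same work, different role
                hits += 1
                precisions.append(hits / rank)
        if precisions:  # skip queries with no valid match in the gallery
            aps.append(sum(precisions) / len(precisions))
    return sum(aps) / len(aps) if aps else 0.0

# Toy data: one query (work 0, role 0) and four gallery images.
queries = [([1.0, 0.0], 0, 0)]
gallery = [
    ([1.0, 0.0], 0, 0),  # same role as the query, excluded by the protocol
    ([0.9, 0.1], 1, 2),  # different work, visually close
    ([0.5, 0.5], 0, 1),  # same work, different role: the only valid match
    ([0.0, 1.0], 1, 3),  # different work, visually far
]
score = cross_role_map(queries, gallery)
print(score)  # 0.5: the valid match is ranked second after filtering
```

Without the role filter, the same-role gallery image would rank first and inflate the score, which is the effect the cross-role protocol is meant to rule out.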