The use of vision transformers (ViT) in computer vision is increasing due to their limited inductive biases (e.g., locality, weight sharing, etc.) and greater scalability compared to other deep learning methods (e.g., convolutional neural networks (CNN)). This has led to some initial studies on the use of ViT for biometric recognition, including fingerprint recognition. In this work, we improve on these initial studies of transformers for fingerprint recognition by (i) evaluating attention-based architectures beyond vanilla ViT, (ii) scaling to larger and more diverse training and evaluation datasets, and (iii) combining the complementary representations of attention-based and CNN-based embeddings for improved state-of-the-art (SOTA) fingerprint recognition for both authentication (1:1 comparisons) and identification (1:N comparisons). Our combined architecture, AFR-Net (Attention-Driven Fingerprint Recognition Network), outperforms several baseline transformer and CNN-based models, as well as a SOTA commercial fingerprint system, Verifinger v12.3, across many intra-sensor, cross-sensor (including contact-to-contactless), and latent-to-rolled fingerprint matching datasets. Additionally, we propose a realignment strategy that uses local embeddings extracted from intermediate feature maps within the networks to refine the global embeddings in low-certainty situations, which significantly boosts recognition accuracy for every model across all evaluations. This realignment strategy requires no additional training and can be applied as a wrapper to any existing deep learning network (attention-based, CNN-based, or both) to boost its performance.
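The idea of applying realignment as a training-free wrapper in low-certainty situations can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from AFR-Net itself: the names `match_with_realignment`, `embed_fn`, and `realign_fn`, the use of cosine similarity as the match score, the `low`/`high` uncertainty band, and the choice to return the refined score are all hypothetical.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D embedding vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def match_with_realignment(probe, gallery, embed_fn, realign_fn,
                           low=0.3, high=0.7):
    """Hypothetical wrapper: score a 1:1 comparison and refine it only
    when the first score falls inside an assumed uncertainty band.

    embed_fn(image) -> global embedding (any existing network).
    realign_fn(probe, gallery) -> probe realigned to the gallery image;
        in the abstract's setting this step would rely on local
        embeddings, which are not modeled in this sketch.
    low, high: assumed thresholds, not values from the paper.
    """
    score = cosine_similarity(embed_fn(probe), embed_fn(gallery))
    if low <= score <= high:
        # Low-certainty case: realign, then re-extract the global embedding.
        realigned = realign_fn(probe, gallery)
        score = cosine_similarity(embed_fn(realigned), embed_fn(gallery))
    return score
```

Because the wrapper only calls `embed_fn`, it is agnostic to whether the underlying network is attention-based, CNN-based, or a combination, and confident comparisons skip the extra realignment cost entirely.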