Facial recognition systems have achieved remarkable success by leveraging deep neural networks, advanced loss functions, and large-scale datasets. However, their performance often deteriorates in real-world scenarios involving low-quality facial images. Such degradations, common in surveillance footage or standoff imaging, include low resolution, motion blur, and various distortions, resulting in a substantial domain gap from the high-quality data typically used during training. While existing approaches attempt to address robustness by modifying network architectures or modeling global spatial transformations, they frequently overlook local, non-rigid deformations that are inherently present in real-world settings. In this work, we introduce \textbf{DArFace}, a \textbf{D}eformation-\textbf{A}ware \textbf{r}obust \textbf{Face} recognition framework that enhances robustness to such degradations without requiring paired high- and low-quality training samples. Our method adversarially integrates both global transformations (e.g., rotation, translation) and local elastic deformations during training to simulate realistic low-quality conditions. Moreover, we introduce a contrastive objective to enforce identity consistency across different deformed views. Extensive evaluations on low-quality benchmarks, including TinyFace, IJB-B, and IJB-C, demonstrate that DArFace surpasses state-of-the-art methods, with significant gains attributed to the inclusion of local deformation modeling.
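The local elastic deformations mentioned above are commonly realized as a smoothed random displacement field applied to the image (in the style of Simard et al.'s elastic distortions). The sketch below is an illustrative implementation under that assumption, not the paper's exact augmentation; the `alpha` (displacement strength) and `sigma` (smoothing scale) parameters are hypothetical defaults.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates


def elastic_deform(image, alpha=8.0, sigma=4.0, rng=None):
    """Apply a random local, non-rigid deformation to a 2D image.

    A random per-pixel displacement field is smoothed with a Gaussian
    filter (so neighboring pixels move coherently) and scaled by alpha,
    then the image is resampled at the displaced coordinates.
    Note: alpha and sigma are illustrative values, not from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape
    # Smoothed random displacement fields for rows (dy) and columns (dx).
    dy = gaussian_filter(rng.uniform(-1, 1, (h, w)), sigma) * alpha
    dx = gaussian_filter(rng.uniform(-1, 1, (h, w)), sigma) * alpha
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.vstack([(ys + dy).ravel(), (xs + dx).ravel()])
    # Bilinear resampling at the displaced coordinates.
    return map_coordinates(image, coords, order=1, mode="reflect").reshape(h, w)
```

In a training loop of the kind the abstract describes, two such deformed views of the same face would be fed to the encoder, with the contrastive objective pulling their embeddings together to enforce identity consistency.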