Recent years have witnessed remarkable achievements in perceptual image restoration (IR), creating an urgent demand for accurate image quality assessment (IQA), which is essential for both performance comparison and algorithm optimization. Unfortunately, existing IQA metrics exhibit inherent weaknesses on IR tasks, particularly when distinguishing fine-grained quality differences among restored images. To address this gap, we contribute FGRestore, the first fine-grained image quality assessment dataset for image restoration, comprising 18,408 restored images across six common IR tasks. Beyond conventional scalar quality scores, FGRestore is also annotated with 30,886 fine-grained pairwise preferences. Based on FGRestore, we conduct a comprehensive benchmark of existing IQA metrics, which reveals significant inconsistencies between score-based IQA evaluations and fine-grained restoration quality. Motivated by these findings, we further propose FGResQ, a new IQA model specifically designed for image restoration, which features both coarse-grained score regression and fine-grained quality ranking. Extensive experiments and comparisons demonstrate that FGResQ significantly outperforms state-of-the-art IQA metrics. Code and model weights are available at https://sxfly99.github.io/FGResQ-Homepage.
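To make the idea of checking a score-based metric against pairwise preferences concrete, the following is a minimal sketch, not the benchmark protocol used for FGRestore. It assumes each preference is stored as a hypothetical `(preferred_index, other_index)` pair and that a higher metric score means better quality; the function name `pairwise_accuracy` and the toy numbers are illustrative only.

```python
from typing import Sequence, Tuple


def pairwise_accuracy(
    scores: Sequence[float],
    prefs: Sequence[Tuple[int, int]],
) -> float:
    """Fraction of preference pairs on which a metric agrees with annotators.

    scores: one metric score per image (assumed: higher means better).
    prefs:  (preferred_index, other_index) pairs; this layout is an
            assumption for illustration, not the dataset's actual format.
    """
    if not prefs:
        raise ValueError("need at least one preference pair")
    # A pair counts as correct only if the preferred image scores strictly higher.
    correct = sum(1 for win, lose in prefs if scores[win] > scores[lose])
    return correct / len(prefs)


# Toy example: images 0 and 1 receive nearly identical scores, so a metric
# that ranks coarse quality levels well can still get the fine-grained pair wrong.
scores = [0.62, 0.60, 0.35, 0.33]
prefs = [(1, 0), (2, 3), (0, 2)]
acc = pairwise_accuracy(scores, prefs)
```

In this toy case the metric agrees with two of the three pairs; the pair it misses is the one whose scores differ by only 0.02, which is the kind of fine-grained disagreement the abstract describes.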