Image super-resolution (SR) has been widely investigated in recent years. However, it is challenging to fairly evaluate the performance of various SR methods, due to the lack of reliable and accurate criteria for perceptual quality. Existing SR image quality assessment (IQA) metrics usually concentrate on a specific kind of degradation without distinguishing visually sensitive areas, and thus lack the adaptive ability to describe diverse SR degradation situations. In this paper, we focus on the textural and structural degradation of SR images, which plays a critical role in visual perception, and design a dual-stream network, dubbed TSNet, that jointly explores textural and structural information for quality prediction. By mimicking the human visual system (HVS), which pays more attention to the significant areas of an image, we develop a spatial attention mechanism that makes visually sensitive areas more distinguishable, which improves prediction accuracy. Feature normalization (F-Norm) is also developed to exploit the inherent spatial correlation of SR features and boost the representation capacity of the network. Experimental results show that the proposed TSNet predicts visual quality more accurately than state-of-the-art IQA methods and demonstrates better consistency with human perception. The source code will be made available at http://github.com/yuqing-liu-dut/NRIQA_SR.
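The abstract does not specify the exact form of the spatial attention mechanism; as a rough illustration of the general idea (weighting visually sensitive spatial locations more heavily), the following is a minimal NumPy sketch of a generic CBAM-style spatial attention map, pooling features along the channel axis and rescaling them with a sigmoid mask. The function name and the additive fusion of the pooled maps are assumptions for illustration, not the paper's actual module.

```python
import numpy as np

def sigmoid(x):
    """Element-wise logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-x))

def spatial_attention(features):
    """Reweight a (C, H, W) feature map by a per-pixel attention mask.

    The mask is built from average- and max-pooling along the channel
    axis, a common way to summarize spatial saliency; a learned
    convolution would normally fuse the two maps, which is simplified
    here to an addition.
    """
    avg_map = features.mean(axis=0)        # (H, W) channel-average pooling
    max_map = features.max(axis=0)         # (H, W) channel-max pooling
    attn = sigmoid(avg_map + max_map)      # (H, W) mask in (0, 1)
    return features * attn[None, :, :]     # broadcast mask over channels

# Example: with all-ones features, every location gets weight sigmoid(2).
feats = np.ones((2, 3, 3))
out = spatial_attention(feats)
```

Because the mask lies in (0, 1), the module can only attenuate features, letting the network emphasize salient regions by suppressing the rest.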