Deep networks perform well in image quality assessment (IQA), yet few studies have investigated how these deep models work. In this work, we first develop a positional masked transformer for IQA, with which we observe that half of an image might contribute only trivially to image quality, whereas the other half is crucial. This observation generalizes to several CNN-based IQA models, in which half of the image regions can likewise dominate image quality. Motivated by this observation, we then derive three semantic measures (saliency, frequency, objectness), which accord closely with the importance of image regions in IQA.
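As an illustration of the kind of experiment the observation implies, here is a minimal sketch of hiding half of an image's patch positions. It is not the authors' implementation: the function names `half_patch_mask` and `mask_image`, the 14×14 grid of 16-pixel patches, the random choice of positions, and zero-filling of hidden patches are all assumptions made for the example.

```python
import numpy as np

def half_patch_mask(grid_h, grid_w, seed=0):
    """Boolean (grid_h, grid_w) mask that keeps half of the patch positions.

    Random selection is an assumption; the abstract does not say how
    positions are chosen.
    """
    rng = np.random.default_rng(seed)
    n = grid_h * grid_w
    keep = rng.permutation(n)[: n // 2]   # indices of the patches to keep
    mask = np.zeros(n, dtype=bool)
    mask[keep] = True
    return mask.reshape(grid_h, grid_w)

def mask_image(image, mask, patch):
    """Zero out every patch whose mask entry is False (hypothetical helper)."""
    # Expand the patch-level mask to pixel resolution.
    block = np.ones((patch, patch), dtype=image.dtype)
    pixel_mask = np.kron(mask.astype(image.dtype), block)
    # Broadcast the 2-D mask over the channel axis.
    return image * pixel_mask[..., None]

# Dummy all-ones image standing in for a real input.
image = np.ones((224, 224, 3), dtype=np.float32)
mask = half_patch_mask(14, 14)
masked = mask_image(image, mask, patch=16)
```

Under these assumptions, one could score both `image` and `masked` with an IQA model and compare the predictions: a small difference would suggest that the hidden half contributes little to the predicted quality.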