Deep networks have demonstrated promising results in Image Quality Assessment (IQA). However, little research has examined how deep IQA models work. This study introduces a novel positional masked transformer for IQA and provides insights into how different regions of an image contribute to its overall quality. Results indicate that half of an image may play a trivial role in determining image quality, while the other half is critical. This observation extends to several CNN-based IQA models, for which half of the image regions likewise have a significant impact on the overall quality assessment. To further this understanding, three semantic measures (saliency, frequency, and objectness) were derived and found to correlate highly with the importance of image regions in IQA.