The performance of objective image quality assessment (IQA) models has been evaluated primarily by comparing model predictions to human quality judgments. Perceptual datasets gathered for this purpose have provided useful benchmarks for improving IQA methods, but their heavy use creates a risk of overfitting. Here, we perform a large-scale comparison of IQA models in terms of their use as objectives for the optimization of image processing algorithms. Specifically, we use eleven full-reference IQA models to train deep neural networks for four low-level vision tasks: denoising, deblurring, super-resolution, and compression. Subjective testing on the optimized images allows us to rank the competing models in terms of their perceptual performance, elucidate their relative advantages and disadvantages in these tasks, and propose a set of desirable properties for incorporation into future IQA models.
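The central idea of the abstract, using a full-reference quality model as the training objective of an image processing algorithm, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the setup used in the study: `mse` and `mae` stand in for the eleven IQA models, a hypothetical one-parameter smoothing filter (`denoise`) stands in for a deep neural network, a flat 1-D signal stands in for an image, and finite differences stand in for backpropagation.

```python
import random


def mse(reference, estimate):
    """Mean squared error, used here as a stand-in full-reference quality model."""
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)


def mae(reference, estimate):
    """Mean absolute error, a second stand-in quality model."""
    return sum(abs(r - e) for r, e in zip(reference, estimate)) / len(reference)


def denoise(noisy, alpha):
    """Hypothetical denoiser: blend each sample with its 3-tap local mean.

    alpha = 0 returns the input unchanged; alpha = 1 returns the local mean.
    """
    n = len(noisy)
    out = []
    for i in range(n):
        left = noisy[max(i - 1, 0)]        # replicate the border sample
        right = noisy[min(i + 1, n - 1)]
        local_mean = (left + noisy[i] + right) / 3.0
        out.append((1.0 - alpha) * noisy[i] + alpha * local_mean)
    return out


def train(reference, noisy, quality_loss, steps=300, lr=0.5, eps=1e-4):
    """Fit alpha by gradient descent on the chosen quality model.

    The gradient is estimated with central finite differences, so any
    function of (reference, estimate) can be plugged in as the objective.
    """
    alpha = 0.0
    for _ in range(steps):
        up = quality_loss(reference, denoise(noisy, alpha + eps))
        down = quality_loss(reference, denoise(noisy, alpha - eps))
        grad = (up - down) / (2.0 * eps)
        alpha = min(1.0, max(0.0, alpha - lr * grad))  # keep alpha in [0, 1]
    return alpha


random.seed(0)
reference = [0.5] * 64  # flat toy "image"
noisy = [r + random.gauss(0.0, 0.1) for r in reference]

results = {}
for name, loss in (("mse", mse), ("mae", mae)):
    alpha = train(reference, noisy, loss)
    results[name] = (
        alpha,
        loss(reference, denoise(noisy, alpha)),  # objective after training
        loss(reference, noisy),                  # objective before training
    )
    print(name, results[name])
```

Each objective yields its own fitted `alpha` and therefore its own output signal. In the study, outputs optimized under different IQA models are compared by human observers rather than by the objective values themselves, a step this sketch does not attempt to reproduce.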