Flow-based generative super-resolution (SR) models learn to produce a diverse set of feasible SR solutions, called the SR space. The diversity of SR solutions increases with the temperature ($\tau$) of the latent variables; higher temperatures introduce random texture variations among sample solutions, which manifest as visual artifacts and lower fidelity. In this paper, we present a simple but effective image ensembling/fusion approach to obtain a single SR image that eliminates random artifacts and improves fidelity without significantly compromising perceptual quality. We achieve this by leveraging the diverse set of feasible, photo-realistic solutions in the SR space spanned by flow models. We propose different image ensembling and fusion strategies that offer multiple paths for moving sample solutions in the SR space toward more desirable destinations in the perception-distortion plane, in a controllable manner, depending on the fidelity vs. perceptual-quality requirements of the task at hand. Experimental results demonstrate that our image ensembling/fusion strategy achieves a more promising perception-distortion trade-off than individual SR samples produced by flow models and by adversarially trained models, in terms of both quantitative metrics and visual quality.
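To make the core idea concrete, the sketch below illustrates the simplest instance of such a strategy: draw several SR samples from a flow model at temperature $\tau$, average them pixel-wise to suppress sample-to-sample texture variation, and optionally blend the high-fidelity mean with a single perceptually sharp sample to move along the perception-distortion plane. This is a minimal sketch, not the paper's exact algorithm; the `flow_model` callable, its `latent_shape` attribute, and the helper names `sample_sr_solutions`, `ensemble_mean`, and `fuse` are all assumptions introduced here for illustration (a real flow SR model such as SRFlow has its own sampling interface).

```python
import numpy as np

def sample_sr_solutions(flow_model, lr_image, n_samples=8, tau=0.8):
    """Draw n_samples SR hypotheses from a flow-based SR model.

    `flow_model` is a hypothetical callable mapping (lr_image, z) -> sr_image,
    where z ~ N(0, tau^2 I); `flow_model.latent_shape` is likewise assumed.
    """
    samples = []
    for _ in range(n_samples):
        z = tau * np.random.randn(*flow_model.latent_shape)  # temperature-scaled latent
        samples.append(flow_model(lr_image, z))
    return np.stack(samples)  # shape: (n_samples, H, W, C)

def ensemble_mean(samples):
    """Pixel-wise mean over sample solutions: averages out the random texture
    variations (artifacts), improving fidelity at some cost in sharpness."""
    return samples.mean(axis=0)

def fuse(samples, alpha=0.5):
    """Blend one perceptually sharp sample with the high-fidelity ensemble mean.

    alpha in [0, 1] controls the trade-off: alpha=0 yields the low-distortion
    mean, alpha=1 a high-perceptual-quality individual sample.
    """
    return (1.0 - alpha) * ensemble_mean(samples) + alpha * samples[0]
```

Varying `alpha` (or `n_samples` and `tau`) traces out different destinations in the perception-distortion plane, which is the controllability the abstract refers to; the paper's actual fusion strategies may weight or combine samples differently.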