Single Image Super Resolution (SISR) is a well-researched problem with broad commercial relevance. However, most of the SISR literature focuses on small-size images under 500px, whereas business needs can mandate the generation of very high resolution images. At Expedia Group, we were tasked with generating images of at least 2000px for display on the website, four times greater than the sizes typically reported in the literature. This requirement poses a challenge that state-of-the-art models, validated on small images, have not been proven to handle. In this paper, we investigate solutions to the problem of generating high-quality images for large-scale super resolution in a commercial setting. We find that training a generative adversarial network (GAN) with attention from scratch using a large-scale lodging image data set generates images with high PSNR and SSIM scores. We describe a novel attentional SISR model for large-scale images, A-SRGAN, that uses a Flexible Self Attention layer to enable processing of large-scale images. We also describe a distributed algorithm which speeds up training by around a factor of five.
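For readers unfamiliar with the metrics named above, PSNR is conventionally defined as 10·log10(MAX² / MSE), where MSE is the mean squared error between the reference image and the super-resolved estimate and MAX is the largest possible pixel value. The abstract does not say how the scores were computed (color space, cropping, bit depth), so the following is only a minimal sketch of the standard definition, assuming 8-bit grayscale pixels supplied as flat sequences. The function name `psnr` and the toy pixel values are illustrative and not taken from the paper.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Assumption: pixels are given as flat sequences of numbers in [0, max_value].
    """
    if len(reference) != len(estimate):
        raise ValueError("images must have the same number of pixels")
    if not reference:
        raise ValueError("images must not be empty")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 2x2 grayscale images, flattened row by row (illustrative values only).
reference = [52, 55, 61, 59]
estimate = [54, 55, 60, 57]
score = psnr(reference, estimate)
print(f"PSNR: {score:.2f} dB")
```

Higher values indicate an estimate closer to the reference. SSIM, the other metric mentioned, compares local luminance, contrast, and structure rather than raw pixel error and is not covered by this sketch.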