Learning continuous image representations has recently gained popularity for image super-resolution (SR) because of its ability to reconstruct high-resolution images at arbitrary scales from low-resolution inputs. Existing methods mostly ensemble nearby features to predict the new pixel at any queried coordinate in the SR image. Such a local ensemble suffers from two limitations: i) it has no learnable parameters and neglects the similarity between visual features; ii) it has a limited receptive field and cannot ensemble relevant features over a larger region, even though such features are important in an image. To address these issues, this paper proposes a continuous implicit attention-in-attention network, called CiaoSR. We explicitly design an implicit attention network to learn the ensemble weights for the nearby local features. Furthermore, we embed a scale-aware attention in this implicit attention network to exploit additional non-local information. Extensive experiments on benchmark datasets demonstrate that CiaoSR significantly outperforms existing single-image SR methods with the same backbone. In addition, CiaoSR achieves state-of-the-art performance on the arbitrary-scale SR task. The effectiveness of the method is also demonstrated in the real-world SR setting. More importantly, CiaoSR can be flexibly integrated into any backbone to improve SR performance.
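To make the core idea concrete, the sketch below shows what a learned local ensemble with implicit attention could look like: instead of fixed interpolation weights, a small network produces attention weights over the K latent features nearest to a continuous query coordinate and blends them to predict the RGB value. This is a minimal, hypothetical illustration under assumed shapes and module names (ImplicitLocalAttention, feat_dim, etc.), not the CiaoSR implementation.

```python
# Minimal sketch (assumption, not the authors' code): attention-weighted
# ensemble of nearby latent features at a continuous query coordinate.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ImplicitLocalAttention(nn.Module):
    """Predict an RGB value at a queried coordinate by attending over
    nearby latent features instead of using fixed bilinear weights."""

    def __init__(self, feat_dim=64, hidden=256):
        super().__init__()
        # query built from the relative coordinate and the cell (pixel) size
        self.to_q = nn.Linear(2 + 2, hidden)
        self.to_k = nn.Linear(feat_dim + 2, hidden)
        self.to_v = nn.Linear(feat_dim + 2, hidden)
        self.out = nn.Sequential(nn.ReLU(), nn.Linear(hidden, 3))

    def forward(self, neighbor_feats, rel_coords, cell):
        # neighbor_feats: (B, N, K, feat_dim) features of K nearby LR positions
        # rel_coords:     (B, N, K, 2) query coordinate minus neighbor coordinate
        # cell:           (B, N, 2) size of one HR pixel in the LR coordinate grid
        k = self.to_k(torch.cat([neighbor_feats, rel_coords], dim=-1))
        v = self.to_v(torch.cat([neighbor_feats, rel_coords], dim=-1))
        q = self.to_q(torch.cat([rel_coords.mean(dim=2), cell], dim=-1)).unsqueeze(2)
        # learned ensemble weights over the K neighbors
        attn = F.softmax((q * k).sum(-1) / k.shape[-1] ** 0.5, dim=-1)   # (B, N, K)
        blended = (attn.unsqueeze(-1) * v).sum(dim=2)                    # (B, N, hidden)
        return self.out(blended)                                         # (B, N, 3)


# toy usage: 8 queried coordinates, 4 neighbors each
model = ImplicitLocalAttention()
rgb = model(torch.randn(1, 8, 4, 64), torch.rand(1, 8, 4, 2), torch.rand(1, 8, 2))
print(rgb.shape)  # torch.Size([1, 8, 3])
```

Because the weights are produced by a network conditioned on features, relative coordinates, and the scale-dependent cell size, the ensemble can account for feature similarity and adapt to the target scale; the scale-aware non-local attention described above would further enrich `neighbor_feats` before this step.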