Existing super-resolution models are often specialized for a single scale, which fundamentally limits their use in practical scenarios. In this paper, we aim to develop a general plugin that can be inserted into existing super-resolution models, conveniently augmenting their ability towards Arbitrary Resolution Image Scaling, thus termed ARIS. We make the following contributions: (i) we propose a transformer-based plugin module that uses spatial coordinates as queries, iteratively attends to the low-resolution image features through cross-attention, and outputs the visual feature for each queried spatial location, resembling an implicit representation for images; (ii) we introduce a novel self-supervised training scheme that exploits consistency constraints to effectively augment the model's ability to upsample images to unseen scales, i.e., scales for which ground-truth high-resolution images are not available; (iii) without loss of generality, we inject the proposed ARIS plugin module into several existing models, namely IPT, SwinIR, and HAT, showing that the resulting models not only maintain their original performance at the fixed scale factor but also extrapolate to unseen scales, substantially outperforming existing any-scale super-resolution models on standard benchmarks such as Urban100 and DIV2K.
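The coordinate-query idea in contribution (i) can be illustrated with a minimal sketch. This is not the ARIS architecture: it assumes a single attention pass with one head and no learned projections, a hypothetical Fourier encoding of coordinates (`fourier_encode`), and keys built from the positions of the low-resolution pixels rather than from their features. All function names are illustrative. It only shows how querying by normalized coordinates lets the output grid take any size.

```python
import math


def query_grid(height, width):
    """Pixel-center coordinates of a height x width grid, normalized to [-1, 1]."""
    return [(-1 + (2 * r + 1) / height, -1 + (2 * c + 1) / width)
            for r in range(height) for c in range(width)]


def fourier_encode(coord, num_bands):
    """Map a 2-D coordinate to 4 * num_bands sin/cos features (assumed encoding)."""
    out = []
    for x in coord:
        for k in range(num_bands):
            freq = (2 ** k) * math.pi
            out.extend([math.sin(freq * x), math.cos(freq * x)])
    return out


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def upsample_features(lr_features, lr_h, lr_w, out_h, out_w, num_bands=2):
    """Query low-resolution features at every location of an out_h x out_w grid.

    lr_features is a row-major list of lr_h * lr_w feature vectors.
    """
    # Assumption: keys are encodings of the LR pixel centers, values are the LR features.
    keys = [fourier_encode(c, num_bands) for c in query_grid(lr_h, lr_w)]
    out = []
    for coord in query_grid(out_h, out_w):
        query = fourier_encode(coord, num_bands)
        out.append(cross_attend(query, keys, lr_features))
    return out


# Usage: a 2 x 2 feature map queried on a 3 x 5 grid (scale factors 1.5 and 2.5).
lr = [[float(i), 1.0] for i in range(4)]
hr = upsample_features(lr, 2, 2, 3, 5)
print(len(hr), len(hr[0]))
```

Because the output size enters only through the set of query coordinates, the same function serves integer and non-integer scale factors alike. A full model would add learned projections, multiple heads, and several stacked attention layers.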