Although numerous solutions have been proposed for image super-resolution, they are usually incompatible with low-power devices that have tight computational and memory constraints. In this paper, we address this problem by proposing a simple yet effective deep network that performs image super-resolution efficiently. Specifically, we develop a spatially-adaptive feature modulation (SAFM) mechanism built on a vision transformer (ViT)-like block. Within it, we first apply the SAFM block to the input features to dynamically select representative feature representations. Since the SAFM block processes the input features from a long-range perspective, we further introduce a convolutional channel mixer (CCM) to extract local contextual information and perform channel mixing simultaneously. Extensive experimental results show that the proposed method is $3\times$ smaller than state-of-the-art efficient SR methods, e.g., IMDN, in terms of network parameters, and requires less computational cost while achieving comparable performance. The code is available at https://github.com/sunny2109/SAFMN.
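To make the two components concrete, the following is a minimal NumPy sketch of the ideas described above, not the authors' implementation: SAFM is reduced to per-group pooling at multiple scales followed by a sigmoid gate (the real block also uses learned depthwise convolutions and a GELU gate), and CCM is reduced to two channel-mixing matrices with a ReLU in between (the real CCM uses a 3x3 convolution to capture local context). All function names and shapes here are illustrative assumptions.

```python
import numpy as np

def avg_pool(x, k):
    """Average-pool a (C, H, W) array by factor k (H, W divisible by k)."""
    c, h, w = x.shape
    return x.reshape(c, h // k, k, w // k, k).mean(axis=(2, 4))

def upsample_nearest(x, k):
    """Nearest-neighbour upsample a (C, H, W) array by factor k."""
    return x.repeat(k, axis=1).repeat(k, axis=2)

def safm_sketch(x, scales=(1, 2, 4, 8)):
    """Spatially-adaptive feature modulation, heavily simplified.

    Splits channels into one group per scale, pools each group to a
    coarser resolution (a long-range view), upsamples back, and uses
    the result to modulate the input features elementwise.
    """
    groups = np.array_split(x, len(scales), axis=0)
    outs = [g if s == 1 else upsample_nearest(avg_pool(g, s), s)
            for g, s in zip(groups, scales)]
    attn = np.concatenate(outs, axis=0)
    return x * (1.0 / (1.0 + np.exp(-attn)))  # sigmoid gate (illustrative)

def ccm_sketch(x, w1, w2):
    """Convolutional channel mixer, reduced to two 1x1 'convolutions'
    (channel-mixing matrices w1, w2) with a ReLU nonlinearity between."""
    h = np.maximum(np.einsum('oc,chw->ohw', w1, x), 0.0)
    return np.einsum('oc,chw->ohw', w2, h)

# Usage: spatial and channel shapes are preserved end to end.
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 16, 16))
out = ccm_sketch(safm_sketch(feat),
                 rng.standard_normal((16, 8)),   # expand channels 8 -> 16
                 rng.standard_normal((8, 16)))   # project back 16 -> 8
print(out.shape)  # (8, 16, 16)
```

The modulation-then-mixing order mirrors the abstract: SAFM supplies the long-range, spatially varying view, and CCM complements it with (here, drastically simplified) local channel mixing.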