Deep learning-based single image super-resolution (SISR) approaches have drawn much attention and achieved remarkable success on modern advanced GPUs. However, most state-of-the-art methods require a huge number of parameters, large memory, and heavy computational resources, which usually leads to slow inference when they are deployed on the CPUs/NPUs of current mobile devices. In this paper, we propose a simple plain convolution network with a fast nearest convolution module (NCNet), which is NPU-friendly and can perform reliable super-resolution in real time. The proposed nearest convolution produces the same output as nearest-neighbor upsampling but is much faster and better suited to the Android NNAPI. Our model can be easily deployed on mobile devices with 8-bit quantization and is fully compatible with all major mobile AI accelerators. Moreover, we conduct comprehensive experiments on different tensor operations on a mobile device to illustrate the efficiency of our network architecture. Our NCNet is trained and validated on the DIV2K 3x dataset, and comparisons with other efficient SR methods demonstrate that NCNet achieves high-fidelity SR results while requiring less inference time. Our code and pretrained models are publicly available at \url{https://github.com/Algolzw/NCNet}.
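The equivalence claimed above, that nearest-neighbor upsampling can be reproduced by a fixed convolution, can be sketched as follows. This is a minimal NumPy illustration under our own assumptions (channel-last layout, function names chosen here for clarity), not the deployed implementation: a 1x1 convolution whose weight simply copies each input channel $s^2$ times, followed by a depth-to-space rearrangement, matches nearest-neighbor upsampling exactly.

```python
import numpy as np

def nearest_conv_upsample(x, scale):
    """'Nearest convolution' sketch: a fixed 1x1 conv + depth-to-space.

    x: input image of shape (H, W, C), channel-last (an illustrative
    layout choice, not necessarily the one used on-device).
    """
    H, W, C = x.shape
    s = scale
    # Fixed 1x1 conv weight of shape (s^2 * C, C): the C x C identity
    # tiled s^2 times, so every output channel block copies the input.
    w = np.tile(np.eye(C), (s * s, 1))
    # Apply the 1x1 convolution: a per-pixel matrix multiply.
    y = x @ w.T                      # (H, W, s^2 * C)
    # Depth-to-space (pixel shuffle): move the s^2 copies into space.
    y = y.reshape(H, W, s, s, C)
    y = y.transpose(0, 2, 1, 3, 4).reshape(H * s, W * s, C)
    return y

def nearest_upsample(x, scale):
    """Plain nearest-neighbor upsampling for comparison."""
    return np.repeat(np.repeat(x, scale, axis=0), scale, axis=1)

# The two paths produce identical results, e.g. at the 3x scale
# used for DIV2K in the paper:
x = np.random.rand(8, 6, 3)
assert np.allclose(nearest_conv_upsample(x, 3), nearest_upsample(x, 3))
```

Because the convolution weights are constant, this skip branch runs as an ordinary (and well-optimized) conv op on NPUs, which is the source of the speedup over a dedicated resize op.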