Convolutional neural network (CNN) based super-resolution models are increasingly proposed for better reconstruction results, but their large model size and complicated structure inhibit real-time hardware implementation. Existing hardware designs are limited to plain networks and suffer from lower reconstruction quality and high memory bandwidth requirements. This paper proposes a super-resolution hardware accelerator with hardware-efficient pixel attention that needs only 25.9K parameters and a simple structure, yet reconstructs images 0.38 dB better than the widely used FSRCNN. The accelerator adopts full-model block-wise convolution to fuse all layers of the model, which reduces external memory access to the model input and output only. In addition, both CNN and pixel attention operations are well supported by processing element (PE) arrays with distributed weights. The final implementation supports Full HD image reconstruction at 30 frames per second in a TSMC 40 nm CMOS process.
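As a rough illustration of the pixel attention operation mentioned above, the following is a minimal sketch of the generic formulation: a 1x1 convolution followed by a sigmoid produces a per-pixel, per-channel gate that scales the input features. The abstract does not describe how the hardware-efficient variant differs, so this is not the paper's design; the function name `pixel_attention`, its argument layout, and the toy values are all assumptions made for illustration.

```python
import math


def sigmoid(v):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def pixel_attention(features, weights, bias):
    """Generic pixel attention (hypothetical helper, not the paper's design).

    features: C x H x W nested lists of floats
    weights:  C x C nested lists, the kernel of a 1x1 convolution
    bias:     list of C floats
    Returns a C x H x W nested list: features scaled by a sigmoid gate.
    """
    channels = len(features)
    height = len(features[0])
    width = len(features[0][0])
    out = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for y in range(height):
        for x in range(width):
            # Gather all channel values at this pixel position.
            pixel = [features[c][y][x] for c in range(channels)]
            for co in range(channels):
                # 1x1 convolution: weighted sum over input channels.
                logit = bias[co] + sum(
                    weights[co][ci] * pixel[ci] for ci in range(channels)
                )
                # Gate each feature value by its own attention coefficient.
                out[co][y][x] = pixel[co] * sigmoid(logit)
    return out


# Toy input (assumed values): 2 channels, 2 x 2 pixels.
feats = [[[1.0, 2.0], [3.0, 4.0]], [[-1.0, 0.5], [0.0, 2.0]]]
zero_w = [[0.0, 0.0], [0.0, 0.0]]
zero_b = [0.0, 0.0]

# With zero weights and bias, every gate is sigmoid(0) = 0.5.
halved = pixel_attention(feats, zero_w, zero_b)
```

Because the gate is computed independently at each pixel from that pixel's channel values alone, the operation needs no neighbouring pixels, which is one plausible reason it suits block-wise processing; the abstract itself does not state this.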