Recently, deep convolutional neural network (CNN)-based face super-resolution methods have achieved great progress in restoring degraded facial details by jointly training with facial priors. However, these methods have some obvious limitations. On the one hand, multi-task joint learning requires additional annotations on the dataset, and the introduced prior networks significantly increase the computational cost of the model. On the other hand, the limited receptive field of CNNs reduces the fidelity and naturalness of the reconstructed facial images, resulting in suboptimal results. In this work, we propose an efficient CNN-Transformer Cooperation Network (CTCNet) for face super-resolution, which uses a multi-scale connected encoder-decoder architecture as its backbone. Specifically, we first devise a novel Local-Global Feature Cooperation Module (LGCM), composed of a Facial Structure Attention Unit (FSAU) and a Transformer block, to promote the consistent restoration of local facial details and global facial structure simultaneously. Then, we design an efficient Local Feature Refinement Module (LFRM) to enhance local facial structure information. Finally, to further improve the restoration of fine facial details, we present a Multi-scale Feature Fusion Unit (MFFU) to adaptively fuse the features from different stages of the encoder. Comprehensive evaluations on various datasets show that the proposed CTCNet significantly outperforms other state-of-the-art methods.
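To make the LGCM design concrete, the following is a minimal PyTorch sketch of how a local CNN branch (the FSAU) and a global Transformer branch could be combined and fused, as described above. The internal layer choices (kernel sizes, spatial attention form, head count, fusion by a 1x1 convolution, channel width) are illustrative assumptions for exposition, not the authors' exact design.

```python
# Hedged sketch of the Local-Global Feature Cooperation Module (LGCM):
# a CNN-based Facial Structure Attention Unit (FSAU) for local detail and a
# Transformer block for global structure, whose outputs are fused.
import torch
import torch.nn as nn


class FSAU(nn.Module):
    """Assumed form of the Facial Structure Attention Unit: convolutions
    followed by a spatial attention map that re-weights local features."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.GELU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.attn = nn.Sequential(nn.Conv2d(channels, 1, 1), nn.Sigmoid())

    def forward(self, x):
        feat = self.body(x)
        return x + feat * self.attn(feat)  # attention-weighted residual


class TransformerBlock(nn.Module):
    """Standard pre-norm Transformer block over flattened spatial tokens,
    standing in for the global branch of LGCM."""
    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.norm1 = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(channels)
        self.mlp = nn.Sequential(nn.Linear(channels, 4 * channels), nn.GELU(),
                                 nn.Linear(4 * channels, channels))

    def forward(self, x):
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)  # (B, H*W, C)
        t = self.norm1(tokens)
        tokens = tokens + self.attn(t, t, t)[0]
        tokens = tokens + self.mlp(self.norm2(tokens))
        return tokens.transpose(1, 2).reshape(b, c, h, w)


class LGCM(nn.Module):
    """Local (FSAU) and global (Transformer) branches fused by a 1x1 conv."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.local_branch = FSAU(channels)
        self.global_branch = TransformerBlock(channels)
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, x):
        local_feat = self.local_branch(x)
        global_feat = self.global_branch(x)
        return self.fuse(torch.cat([local_feat, global_feat], dim=1))


if __name__ == "__main__":
    x = torch.randn(1, 64, 32, 32)      # toy feature map
    print(LGCM(64)(x).shape)            # torch.Size([1, 64, 32, 32])
```

In this sketch the two branches operate on the same input features in parallel, so the fused output carries both locally attended details and globally attended structure; in CTCNet such modules would sit inside the multi-scale encoder-decoder backbone.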