One-shot image generation (OSG) with generative adversarial networks that learn from the internal patches of a single image has attracted widespread attention. Recent studies have primarily focused on extracting image features from probabilistically distributed inputs with pure convolutional neural networks (CNNs). However, CNNs with limited receptive fields struggle to extract and preserve global structural information. In this paper, we therefore propose TcGAN, a novel structure-preserving method with an individual vision transformer, to overcome the shortcomings of existing one-shot image generation methods. Specifically, TcGAN preserves the global structure of an image during training while remaining compatible with local details, and maintains the integrity of semantic-aware information by exploiting the transformer's powerful capability for modeling long-range dependencies. We also propose a new scaling formula that is scale-invariant during computation, which effectively improves the quality of images generated by the OSG model on image super-resolution tasks. We present the design of the TcGAN framework, together with comprehensive experiments and ablation studies demonstrating that TcGAN achieves arbitrary image generation with the fastest running time. Finally, TcGAN delivers excellent performance when applied to other image processing tasks, e.g., super-resolution and image harmonization; these results further confirm its superiority.
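The abstract's central architectural idea, pairing a convolutional branch that captures local detail with a transformer branch that models long-range, global structure, can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the module name HybridBlock, the additive fusion, and all layer sizes are hypothetical choices for exposition.

```python
# Minimal sketch (assumptions, not the TcGAN code): one generator block that
# fuses a CNN branch (local texture) with a ViT-style self-attention branch
# (global structure), the hybrid the abstract describes.
import torch
import torch.nn as nn
import torch.nn.functional as F


class HybridBlock(nn.Module):
    """Conv branch for local details + attention branch for global structure."""

    def __init__(self, channels: int, patch: int = 8, heads: int = 4):
        super().__init__()
        # Local branch: a plain convolutional stack with a limited receptive field.
        self.local = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.LeakyReLU(0.2),
        )
        # Global branch: non-overlapping patch embedding (as in a ViT),
        # then multi-head self-attention over all patch tokens.
        self.patch = patch
        self.embed = nn.Conv2d(channels, channels, patch, stride=patch)
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        local = self.local(x)
        # Tokenize into (B, N, C), attend over every patch pair so each token
        # sees the whole image, then restore the spatial layout.
        tokens = self.embed(x).flatten(2).transpose(1, 2)
        t = self.norm(tokens)
        attended, _ = self.attn(t, t, t)
        g = attended.transpose(1, 2).reshape(b, c, h // self.patch, w // self.patch)
        g = F.interpolate(g, size=(h, w), mode="nearest")
        # Additive fusion: local details plus globally attended structure.
        return local + g


if __name__ == "__main__":
    block = HybridBlock(channels=32)
    out = block(torch.randn(1, 32, 64, 64))
    print(out.shape)  # torch.Size([1, 32, 64, 64])
```

In this sketch the attention branch is what gives the block a global receptive field in a single layer, which is the capability the abstract argues pure CNNs lack; how TcGAN actually wires the two branches is specified in the paper itself.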