Generative Adversarial Networks (GANs) have been very successful at synthesizing images that resemble those in a given dataset, and the generated images are highly realistic. GANs have shown potential in several computer vision applications, including image generation, image-to-image translation, and video synthesis. Conventionally, the generator network is the backbone of a GAN and produces the samples, while the discriminator network facilitates the training of the generator. The discriminator is usually a Convolutional Neural Network (CNN), whereas the generator is usually either an Up-CNN for image generation or an Encoder-Decoder network for image-to-image translation. Convolution-based networks exploit only local relationships within a layer, so they need deep stacks of layers to extract abstract features; as a result, CNNs struggle to exploit global relationships in the feature space. Recently developed Transformer networks, in contrast, can exploit global relationships at every layer, and they have shown substantial performance improvements on several computer vision problems. Motivated by the success of Transformer networks and GANs, recent works have tried to exploit Transformers within the GAN framework for image and video synthesis. This paper presents a comprehensive survey of the developments and advancements in GANs that utilize Transformer networks for computer vision applications. A performance comparison for several applications on benchmark datasets is also performed and analyzed. The survey is intended to help the deep learning and computer vision community understand the research trends and gaps related to Transformer-based GANs, and develop advanced GAN architectures that exploit both global and local relationships for different applications.
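The contrast the abstract draws between local convolution and global attention can be illustrated with a minimal pure-Python sketch. It is not taken from any of the surveyed models. The helper names (`conv1d`, `self_attention`, `stacked`, `influences`), the toy 1D sequence, the kernel values, and the use of identity query/key/value projections are all simplifying assumptions made here for illustration. The sketch checks whether a small change at one input position can reach a distant output position.

```python
import math

def conv1d(x, kernel):
    """Zero-padded 1D convolution (cross-correlation), same output length."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(kernel[j] * padded[i + j] for j in range(k))
            for i in range(len(x))]

def self_attention(x):
    """Single-head self-attention on scalar tokens.

    Assumption: Q, K and V projections are the identity, to keep the
    sketch short; real Transformer layers learn these projections.
    """
    out = []
    for q in x:
        scores = [q * k for k in x]
        m = max(scores)                      # subtract max for a stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append(sum(w * v for w, v in zip(weights, x)) / total)
    return out

def stacked(layer, depth):
    """Compose the same layer `depth` times."""
    def run(x):
        for _ in range(depth):
            x = layer(x)
        return x
    return run

def influences(layer, x, src, dst, eps=1e-3):
    """True if perturbing input position `src` changes output position `dst`."""
    bumped = list(x)
    bumped[src] += eps
    return abs(layer(bumped)[dst] - layer(x)[dst]) > 1e-12

x = [0.5, -1.0, 0.25, 2.0, -0.5, 1.5, 0.75, -0.25]
conv = lambda s: conv1d(s, [0.25, 0.5, 0.25])   # kernel size 3

conv_near = influences(conv, x, src=1, dst=0)             # neighbour: inside the kernel window
conv_far = influences(conv, x, src=7, dst=0)              # distant: outside the window
attn_far = influences(self_attention, x, src=7, dst=0)    # attention sees every position
conv_depth6 = influences(stacked(conv, 6), x, src=7, dst=0)
conv_depth7 = influences(stacked(conv, 7), x, src=7, dst=0)

print(conv_near, conv_far, attn_far, conv_depth6, conv_depth7)
```

In this toy setting a single size-3 convolution only connects neighbouring positions, and seven stacked layers are needed before position 7 can affect position 0, whereas one attention layer already connects them.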