With the rapid development of the Metaverse, virtual humans have emerged, and human image synthesis and editing techniques such as pose transfer have recently become popular. Most existing techniques rely on GANs, which can generate good human images even under large variations and occlusions. To the best of our knowledge, however, the existing state-of-the-art method still has two problems. First, the synthesized images are not rendered realistically, with some regions rendered poorly. Second, GAN training is unstable and slow to converge, and is prone to failures such as mode collapse. We propose several methods to address these two problems. To improve rendering quality, we replace the traditional Residual Block with the Residual Fast Fourier Transform Block. To improve the speed and stability of GAN training, we use spectral normalization and the Wasserstein distance. Experiments demonstrate that the proposed methods are effective at solving the problems listed above, and we achieve state-of-the-art scores in LPIPS and PSNR.
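The two training-stability ingredients named in the abstract, spectral normalization and the Wasserstein distance, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the power-iteration count, and the plain critic/generator losses are assumptions made for illustration, and the abstract does not say how the Lipschitz constraint is enforced beyond spectral normalization.

```python
import numpy as np


def spectral_normalize(weight, n_iters=50, eps=1e-12):
    """Scale a 2-D weight matrix so its largest singular value is about 1.

    The largest singular value is estimated by power iteration.
    (Hypothetical helper; n_iters and eps are illustrative choices.)
    """
    rng = np.random.default_rng(0)
    u = rng.standard_normal(weight.shape[0])
    for _ in range(n_iters):
        # Alternate between right and left singular-vector estimates.
        v = weight.T @ u
        v /= np.linalg.norm(v) + eps
        u = weight @ v
        u /= np.linalg.norm(u) + eps
    sigma = u @ weight @ v  # estimate of the spectral norm
    return weight / sigma, sigma


def wasserstein_critic_loss(scores_real, scores_fake):
    """Critic loss: minimizing it maximizes E[D(real)] - E[D(fake)]."""
    return np.mean(scores_fake) - np.mean(scores_real)


def wasserstein_generator_loss(scores_fake):
    """Generator loss: push critic scores of generated images upward."""
    return -np.mean(scores_fake)


if __name__ == "__main__":
    w = np.array([[2.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])
    w_sn, sigma = spectral_normalize(w)
    print("estimated spectral norm:", sigma)
    print("critic loss:", wasserstein_critic_loss(np.array([1.0, 3.0]),
                                                  np.array([0.0, 1.0])))
```

In a real model the normalization would be applied to every critic layer at each training step, and deep-learning frameworks typically ship this as a built-in layer wrapper rather than a hand-written loop.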