In recent years, considerable progress has been made in the visual quality of Generative Adversarial Networks (GANs). Even so, these networks still suffer from degraded quality in high-frequency content, stemming from a spectrally biased architecture and similarly unfavorable loss functions. To address this issue, we present a novel general-purpose Style and WAvelet based GAN (SWAGAN) that implements progressive generation in the frequency domain. SWAGAN incorporates wavelets throughout its generator and discriminator architectures, enforcing a frequency-aware latent representation at every step. This approach improves the visual quality of the generated images and considerably increases computational performance. We demonstrate the advantage of our method by integrating it into the StyleGAN2 framework and verifying that content generation in the wavelet domain leads to higher-quality images with more realistic high-frequency content. Furthermore, we verify that our model's latent space retains the qualities that allow StyleGAN to serve as a basis for a multitude of editing tasks, and show that our frequency-aware approach also improves downstream visual quality.
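To make the idea of working in the wavelet domain concrete, the following is a minimal sketch of a single-level 2D wavelet decomposition and its inverse. It assumes an orthonormal Haar wavelet, which the abstract does not specify, and the function names `haar_dwt2` and `haar_idwt2` are hypothetical. It is an illustration of how an image splits into one low-frequency band and three high-frequency detail bands, not the SWAGAN implementation.

```python
def haar_dwt2(img):
    """One level of a 2D Haar transform (assumed wavelet, for illustration).

    `img` is a list of rows with even height and width. Returns four
    half-resolution bands: `ll` (low-frequency average) and three
    high-frequency detail bands `lh`, `hl`, `hh`.
    """
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, len(img), 2):
        row_ll, row_lh, row_hl, row_hh = [], [], [], []
        for col in range(0, len(img[r]), 2):
            # The four pixels of one 2x2 block.
            a, b = img[r][col], img[r][col + 1]
            c, d = img[r + 1][col], img[r + 1][col + 1]
            row_ll.append((a + b + c + d) / 2)  # block average (scaled)
            row_lh.append((a + b - c - d) / 2)  # difference between rows
            row_hl.append((a - b + c - d) / 2)  # difference between columns
            row_hh.append((a - b - c + d) / 2)  # diagonal difference
        ll.append(row_ll)
        lh.append(row_lh)
        hl.append(row_hl)
        hh.append(row_hh)
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Invert `haar_dwt2`, rebuilding the full-resolution image."""
    out = []
    for r in range(len(ll)):
        top, bottom = [], []
        for col in range(len(ll[r])):
            s, v, h, g = ll[r][col], lh[r][col], hl[r][col], hh[r][col]
            top.extend([(s + v + h + g) / 2, (s + v - h - g) / 2])
            bottom.extend([(s - v + h - g) / 2, (s - v - h + g) / 2])
        out.append(top)
        out.append(bottom)
    return out


# Usage: decompose a small image and reconstruct it.
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
bands = haar_dwt2(image)
restored = haar_idwt2(*bands)
```

Because the transform is invertible, a network that predicts the four bands loses no information relative to predicting pixels, while the high-frequency content is represented explicitly in the detail bands.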