Recent studies have shown remarkable progress in GANs based on implicit neural representations (INRs) - an MLP that produces an RGB value given its (x, y) coordinate. They represent an image as a continuous version of the underlying 2D signal instead of as a 2D array of pixels, which opens new horizons for GAN applications (e.g., zero-shot super-resolution, image outpainting). However, training existing approaches incurs a heavy computational cost proportional to the image resolution, since they evaluate the MLP at every (x, y) coordinate. To alleviate this issue, we propose multi-stage patch-based training, a novel and scalable approach that can train INR-based GANs with a flexible computational cost regardless of the image resolution. Specifically, our method generates and discriminates images patch by patch to learn local details, and learns global structural information through a novel reconstruction loss, enabling efficient GAN training. We conduct experiments on several benchmark datasets to demonstrate that our approach reduces the GPU memory usage of baseline models while maintaining FIDs at a reasonable level.
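As a rough illustration of the setup described above, the sketch below shows a coordinate MLP evaluated on the coordinates of a single patch rather than the full pixel grid, which is where the patch-proportional cost comes from. This is a minimal sketch in PyTorch under our own assumptions: the names `INRGenerator` and `patch_coords` are hypothetical, and latent conditioning, positional encodings, the discriminator, and the paper's reconstruction loss are all omitted.

```python
import torch
import torch.nn as nn

class INRGenerator(nn.Module):
    """Hypothetical coordinate MLP: maps an (x, y) coordinate to an RGB value.
    A real INR-based GAN generator would also condition on a latent code."""
    def __init__(self, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Tanh(),
        )

    def forward(self, coords):       # coords: (N, 2), normalized to [-1, 1]
        return self.mlp(coords)      # (N, 3) RGB values

def patch_coords(res, patch, top, left):
    """Coordinates of a patch x patch crop of a res x res image grid.
    Generating the full image takes one MLP evaluation per pixel, i.e.
    O(res^2) cost; a patch costs only O(patch^2), independent of res."""
    ys = torch.arange(top, top + patch)
    xs = torch.arange(left, left + patch)
    grid = torch.stack(torch.meshgrid(ys, xs, indexing="ij"), dim=-1)
    return grid.reshape(-1, 2).float() / (res - 1) * 2 - 1

g = INRGenerator()
# One 64x64 patch of a 1024x1024 image: 64*64 MLP calls instead of 1024*1024.
rgb = g(patch_coords(res=1024, patch=64, top=128, left=256))  # (4096, 3)
```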