Recent image generation models show remarkable generation performance. However, they mirror the strong location preferences present in their training datasets, which we call spatial bias. As a result, generators render poor samples at unseen locations and scales. We argue that generators rely on their implicit positional encoding to render spatial content. From our observations, this implicit positional encoding is translation-variant, which makes the generator spatially biased. To address this issue, we propose injecting explicit positional encoding at each scale of the generator. By learning a spatially unbiased generator, we facilitate the robust use of generators in multiple tasks, such as GAN inversion, multi-scale generation, and generation at arbitrary sizes and aspect ratios. Furthermore, we show that our method can also be applied to denoising diffusion probabilistic models.
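The injection step described above can be illustrated with a minimal sketch. The abstract does not specify the exact encoding used, so this example assumes a standard 2D sinusoidal positional encoding (one sin/cos pair per axis) concatenated to a generator's feature map along the channel dimension; the function names `sinusoidal_pe_2d` and `inject_pe` are hypothetical, and real generators would apply this per scale inside the network.

```python
import numpy as np

def sinusoidal_pe_2d(h, w, channels):
    """Explicit 2D sinusoidal positional encoding of shape (h, w, channels).

    channels must be divisible by 4: sin and cos bands for each of y and x.
    """
    assert channels % 4 == 0
    c = channels // 4
    freqs = 1.0 / (10000 ** (np.arange(c) / c))              # (c,) frequency bands
    y = np.arange(h)[:, None] * freqs[None, :]               # (h, c) absolute row phase
    x = np.arange(w)[:, None] * freqs[None, :]               # (w, c) absolute col phase
    pe_y = np.concatenate([np.sin(y), np.cos(y)], axis=1)    # (h, 2c)
    pe_x = np.concatenate([np.sin(x), np.cos(x)], axis=1)    # (w, 2c)
    # Broadcast row/col encodings over the full grid and stack on channels.
    pe = np.concatenate([
        np.broadcast_to(pe_y[:, None, :], (h, w, 2 * c)),
        np.broadcast_to(pe_x[None, :, :], (h, w, 2 * c)),
    ], axis=-1)                                              # (h, w, 4c)
    return pe

def inject_pe(features):
    """Concatenate explicit positional encoding to a (h, w, ch) feature map.

    Applied at each scale of the generator, the encoding gives every spatial
    location an absolute coordinate signal instead of a translation-variant,
    implicitly learned one.
    """
    h, w, ch = features.shape
    pe = sinusoidal_pe_2d(h, w, ch)
    return np.concatenate([features, pe], axis=-1)           # (h, w, 2*ch)
```

Because the encoding is a function of absolute grid position, two different rows receive different codes, which is exactly the explicit location signal the generator would otherwise have to infer implicitly.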