Generative models operate at fixed resolution, even though natural images come in a variety of sizes. As high-resolution details are downsampled away and low-resolution images are discarded altogether, precious supervision is lost. We argue that every pixel matters and create datasets with variable-size images, collected at their native resolutions. To take advantage of varied-size data, we introduce continuous-scale training, a process that samples patches at random scales to train a new generator with variable output resolutions. First, conditioning the generator on a target scale allows us to generate higher resolution images than previously possible, without adding layers to the model. Second, by conditioning on continuous coordinates, we can sample patches that still obey a consistent global layout, which also allows for scalable training at higher resolutions. Controlled FFHQ experiments show that our method can take advantage of multi-resolution training data better than discrete multi-scale approaches, achieving better FID scores and cleaner high-frequency details. We also train on other natural image domains including churches, mountains, and birds, and demonstrate arbitrary scale synthesis with both coherent global layouts and realistic local details, going beyond 2K resolution in our experiments. Our project page is available at: https://chail.github.io/anyres-gan/.
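The patch-sampling idea in the abstract can be made concrete with a small sketch: draw a random scale, then draw a fixed-size patch whose location is expressed in normalized global coordinates, which is what a coordinate-conditioned generator would consume. The function name, the uniform sampling distributions, and the returned fields are illustrative assumptions, not the paper's exact procedure.

```python
import random

def sample_patch(native_hw, patch_size=256):
    """Sample (scale, x0, y0) for one training patch.

    Hypothetical sketch of continuous-scale patch sampling:
    scale = 1.0 crops native-resolution pixels, while smaller scales
    see a larger, downsampled portion of the image. (x0, y0) is the
    patch's top-left corner in normalized [0, 1] image coordinates,
    so patches from any scale share one consistent global layout.
    """
    h, w = native_hw
    # Smallest scale at which a patch_size crop still fits inside
    # the resized image.
    min_scale = patch_size / min(h, w)
    scale = random.uniform(min_scale, 1.0)
    # Fraction of the full image the patch covers at this scale.
    frac_h = min(1.0, patch_size / (scale * h))
    frac_w = min(1.0, patch_size / (scale * w))
    # Top-left corner in normalized global coordinates.
    y0 = random.uniform(0.0, 1.0 - frac_h)
    x0 = random.uniform(0.0, 1.0 - frac_w)
    return {"scale": scale, "x0": x0, "y0": y0,
            "frac_h": frac_h, "frac_w": frac_w}
```

In a training loop, the returned coordinates would both index the crop from the native-resolution image and condition the generator, so that independently sampled patches remain consistent with one global layout.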