Given an abstract, deformed, ordinary sketch from untrained amateurs like you and me, this paper turns it into a photorealistic image, just like those shown in Fig. 1(a), all non-cherry-picked. We differ significantly from prior art in that we do not require an edgemap-like sketch to start with, but aim to work with abstract free-hand human sketches. In doing so, we essentially democratise the sketch-to-photo pipeline, "picturing" a sketch regardless of how well you sketch. Our contribution at the outset is a decoupled encoder-decoder training paradigm, where the decoder is a StyleGAN trained on photos only. Importantly, this ensures that generated results are always photorealistic. The rest is then centred on how best to deal with the abstraction gap between sketch and photo. For that, we propose an autoregressive sketch mapper, trained on sketch-photo pairs, that maps a sketch to the StyleGAN latent space. We further introduce specific designs to tackle the abstract nature of human sketches, including a fine-grained discriminative loss built on a trained sketch-photo retrieval model, and a partial-aware sketch augmentation strategy. Finally, we showcase a few downstream tasks that our generation model enables; among them, we show how fine-grained sketch-based image retrieval, a well-studied problem in the sketch community, can be reduced to an image (generated) to image retrieval task, surpassing the state of the art. We put forward generated results in the supplementary material for everyone to scrutinise.
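To make the last downstream task concrete, the following is a minimal sketch of what "image (generated) to image retrieval" could look like once a sketch has been turned into a photo. It is not the paper's implementation: the feature vectors are toy values, and in practice the query vector would presumably come from embedding the generated photo with some photo feature extractor, which is assumed here and not shown. The names `cosine` and `rank_gallery` are hypothetical helpers introduced only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_gallery(query_vec, gallery_vecs):
    """Return gallery indices ordered from most to least similar to the query.

    query_vec    : features of the photo generated from the sketch (assumed given)
    gallery_vecs : features of the candidate photos in the gallery
    """
    scores = [cosine(query_vec, g) for g in gallery_vecs]
    return sorted(range(len(gallery_vecs)), key=lambda i: scores[i], reverse=True)


# Toy data: three gallery photos and one generated-photo query (illustrative values).
gallery = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.6, 0.8, 0.0]]
query = [0.5, 0.9, 0.0]
ranking = rank_gallery(query, gallery)
print(ranking)
```

Because both the query and the gallery live in the photo domain in this setup, a single photo feature space suffices for ranking; how the features are obtained is left open here.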