Creative sketching or doodling is an expressive activity in which imaginative and previously unseen depictions of everyday visual objects are drawn. Creative sketch image generation is a challenging vision problem, where the task is to generate diverse yet realistic creative sketches that exhibit unseen compositions of visual-world objects. Here, we propose a novel coarse-to-fine two-stage framework, DoodleFormer, that decomposes the creative sketch generation problem into the creation of a coarse sketch composition followed by the incorporation of fine details in the sketch. We introduce graph-aware transformer encoders that effectively capture global dynamic as well as local static structural relations among different body parts. To ensure diversity of the generated creative sketches, we introduce a probabilistic coarse sketch decoder that explicitly models the variations of each sketch body part to be drawn. Experiments are performed on two creative sketch datasets: Creative Birds and Creative Creatures. Our qualitative, quantitative and human-based evaluations show that DoodleFormer outperforms the state-of-the-art on both datasets, yielding realistic and diverse creative sketches. On Creative Creatures, DoodleFormer achieves an absolute gain of 25 in terms of Fréchet inception distance (FID) over the state-of-the-art. We also demonstrate the effectiveness of DoodleFormer for the related applications of text-to-creative-sketch generation and sketch completion.
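For reference, FID is commonly defined as the Fréchet distance between two Gaussians fitted to Inception-network features of real and generated images. Lower values are better, so a "gain" here presumably means a reduction in FID. The symbols below ($\mu_r, \Sigma_r$ for the real-image feature mean and covariance, $\mu_g, \Sigma_g$ for the generated-image ones) are notation introduced for illustration and are not taken from the abstract.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
             + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g
             - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```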