This survey reviews text-to-image diffusion models, motivated by the growing popularity of diffusion models across a wide range of generative tasks. To keep the survey self-contained, we begin with a brief introduction to how a basic diffusion model works for image synthesis, followed by how conditioning or guidance improves learning. Building on this, we review state-of-the-art methods for text-conditioned image synthesis, i.e., text-to-image generation. We then summarize applications beyond text-to-image generation: text-guided creative generation and text-guided image editing. Finally, we discuss open challenges and promising future directions.