DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation

Large text-to-image models achieved a remarkable leap in the evolution of AI, enabling high-quality and diverse synthesis of images from a given text prompt. However, these models lack the ability to mimic the appearance of subjects in a given reference set and synthesize novel renditions of them in different contexts. In this work, we present a new approach for "personalization" of text-to-image diffusion models (specializing them to users' needs). Given as input just a few images of a subject, we fine-tune a pretrained text-to-image model (Imagen, although our method is not limited to a specific model) such that it learns to bind a unique identifier with that specific subject. Once the subject is embedded in the output domain of the model, the unique identifier can then be used to synthesize fully-novel photorealistic images of the subject contextualized in different scenes. By leveraging the semantic prior embedded in the model with a new autogenous class-specific prior preservation loss, our technique enables synthesizing the subject in diverse scenes, poses, views, and lighting conditions that do not appear in the reference images. We apply our technique to several previously-unassailable tasks, including subject recontextualization, text-guided view synthesis, appearance modification, and artistic rendering (all while preserving the subject's key features). Project page: https://dreambooth.github.io/
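The abstract names a class-specific prior preservation loss but does not give its form. The following is a minimal sketch under assumptions, not the authors' implementation: it treats the objective as a reconstruction term on the few subject images plus a weighted term on class images that the pretrained model generated itself (one reading of "autogenous"). The function names, the `prior_weight` parameter, the prompt template, and the placeholder identifier `xyz` are all hypothetical, and plain Python lists stand in for image tensors.

```python
from typing import Sequence, Tuple


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def prior_preservation_loss(
    subject_pred: Sequence[float],
    subject_target: Sequence[float],
    class_pred: Sequence[float],
    class_target: Sequence[float],
    prior_weight: float = 1.0,  # hypothetical weighting, not taken from the abstract
) -> float:
    """Assumed two-term objective.

    subject term: fit the few reference images under the identifier prompt.
    prior term:   stay close to samples the frozen pretrained model produced
                  for the plain class prompt, so the class prior is kept.
    """
    subject_term = mse(subject_pred, subject_target)
    prior_term = mse(class_pred, class_target)
    return subject_term + prior_weight * prior_term


def build_prompts(identifier: str, class_noun: str) -> Tuple[str, str]:
    """Return (subject prompt, class prompt); the template is an assumption."""
    return f"a {identifier} {class_noun}", f"a {class_noun}"


if __name__ == "__main__":
    subject_prompt, class_prompt = build_prompts("xyz", "dog")
    print(subject_prompt, "|", class_prompt)
    print(prior_preservation_loss([1.0, 2.0], [1.0, 4.0],
                                  [0.0, 0.0], [1.0, 1.0],
                                  prior_weight=0.5))
```

In this sketch, setting `prior_weight` to zero reduces the objective to plain fine-tuning on the subject images, which is the setting where the model would be free to drift away from its general notion of the class.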
