Despite recent advancements, the field of text-to-image synthesis still suffers from a lack of fine-grained control. Using only text, it remains challenging to deal with issues such as concept coherence and concept contamination. We propose a method to enhance control by generating specific concepts that can be reused across multiple images, effectively expanding natural language with new words that can be combined much like the colors on a painter's palette. Unlike previous contributions, our method does not copy visuals from input data and can generate concepts through text alone. We perform a set of comparisons and find our method to be a significant improvement over text-only prompts.