Although text-to-image diffusion models have made significant strides in generating images from text, they sometimes produce images that resemble their training data more closely than they follow the provided text. This limitation has hindered their use in both 2D and 3D applications. To address this problem, we explored the use of negative prompts but found that the current implementation fails to produce the desired results, particularly when the main and negative prompts overlap. To overcome this issue, we propose Perp-Neg, a new algorithm that leverages the geometric properties of the score space to address the shortcomings of the current negative-prompt algorithm. Perp-Neg does not require any training or fine-tuning of the model. We experimentally demonstrate that, in the 2D case, Perp-Neg provides greater flexibility in image generation by enabling users to edit unwanted concepts out of the initially generated images. To extend Perp-Neg to 3D, we first conducted a thorough exploration of how it can be used in 2D to condition the diffusion model to generate desired views rather than being biased toward canonical views. Finally, we applied this 2D intuition to integrate Perp-Neg with the state-of-the-art text-to-3D method (DreamFusion), effectively addressing its Janus (multi-head) problem. Our project page is available at https://Perp-Neg.github.io/
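The abstract does not spell out the update rule, so the following is only a minimal sketch of one way to read "leveraging the geometric properties of the score space". It assumes that each prompt contributes a direction (its noise prediction minus the unconditional one) and that only the part of the negative direction perpendicular to the main direction is subtracted, so overlap between the two prompts does not cancel the main concept. The function names, arguments, and default guidance scale are hypothetical and are not taken from the authors' code.

```python
import numpy as np


def perpendicular_component(v, ref, eps=1e-12):
    """Return the part of `v` orthogonal to `ref` (arrays treated as flat vectors)."""
    v_flat, ref_flat = v.ravel(), ref.ravel()
    # Scalar projection coefficient of v onto ref; eps guards against a zero ref.
    coeff = np.dot(v_flat, ref_flat) / (np.dot(ref_flat, ref_flat) + eps)
    return v - coeff * ref


def perp_neg_guidance(eps_uncond, eps_main, eps_negs, neg_weights, guidance_scale=7.5):
    """Combine noise predictions under the assumed perpendicular-negative rule.

    eps_uncond  : unconditional noise prediction (array)
    eps_main    : prediction conditioned on the main prompt
    eps_negs    : list of predictions conditioned on each negative prompt
    neg_weights : one non-negative weight per negative prompt (assumed)
    """
    main_dir = eps_main - eps_uncond
    combined = main_dir
    for eps_neg, weight in zip(eps_negs, neg_weights):
        neg_dir = eps_neg - eps_uncond
        # Subtract only what the negative prompt does NOT share with the main prompt.
        combined = combined - weight * perpendicular_component(neg_dir, main_dir)
    return eps_uncond + guidance_scale * combined
```

Under this assumption, a negative prompt that coincides with the main prompt has no perpendicular component, so the result reduces to ordinary classifier-free guidance on the main prompt instead of erasing it.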