The recent success of text-to-image diffusion models has also revolutionized semantic image editing, enabling images to be manipulated according to query/target texts. Despite these advances, a significant challenge remains: the prior bias of pre-trained models can be introduced during image editing, e.g., as unexpected modifications to inappropriate regions. To this end, we present a novel Dual-Cycle Diffusion model that addresses prior bias by generating an unbiased mask to guide image editing. The proposed model incorporates a Bias Elimination Cycle consisting of a forward path and an inverted path, each featuring a Structural Consistency Cycle that preserves image content during editing. The forward path uses the pre-trained model to produce the edited image, while the inverted path converts the result back to the source image. The unbiased mask is generated by comparing the processed source image with the edited image, to ensure that both conform to the same distribution. Our experiments demonstrate the effectiveness of the proposed method, which improves the D-CLIP score from 0.272 to 0.283. The code will be available at https://github.com/JohnDreamer/DualCycleDiffsion.
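As a rough illustration of what mask-guided editing means, the sketch below derives a binary mask from the per-pixel difference between two images and uses it to confine an edit to the masked region. This is not the Dual-Cycle Diffusion implementation: the abstract does not specify how the mask is computed, so the function names `difference_mask` and `apply_masked_edit`, the mean absolute difference, and the `threshold` value are all assumptions made for this example.

```python
import numpy as np


def difference_mask(source, edited, threshold=0.1):
    """Return a binary (H, W) mask of pixels where two images differ.

    source, edited: float arrays of shape (H, W, C) with values in [0, 1].
    threshold: hypothetical cutoff on the mean absolute channel difference.
    """
    source = np.asarray(source, dtype=np.float64)
    edited = np.asarray(edited, dtype=np.float64)
    if source.shape != edited.shape:
        raise ValueError("source and edited must have the same shape")
    # Average the absolute difference over the channel axis.
    diff = np.abs(source - edited).mean(axis=-1)
    return (diff > threshold).astype(np.float64)


def apply_masked_edit(source, edited, mask):
    """Keep edited pixels inside the mask and source pixels outside it."""
    mask = np.asarray(mask, dtype=np.float64)[..., None]  # broadcast over channels
    return mask * edited + (1.0 - mask) * source


# Toy example: a 4x4 black image whose 2x2 center is edited to white,
# plus a small unintended change at the top-left corner.
source = np.zeros((4, 4, 3))
edited = source.copy()
edited[1:3, 1:3] = 1.0   # intended edit
edited[0, 0] = 0.05      # unintended drift, below the threshold

mask = difference_mask(source, edited)
result = apply_masked_edit(source, edited, mask)
```

In this toy case the mask covers only the four center pixels, so the small drift at the corner is discarded and the corner keeps its source value. A fixed threshold is the simplest possible choice; it stands in for whatever comparison the method actually uses.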