Synthetic images created by image editing operations are prevalent, but color or illumination inconsistency between the manipulated region and the background can make them look unrealistic. It is therefore important, yet challenging, to localize the inharmonious region so that the quality of synthetic images can be improved. Inspired by classic clustering algorithms, we group pixels into two clusters, an inharmonious cluster and a background cluster, by inserting a novel Recurrent Self-Reasoning (RSR) module into the bottleneck of a UNet structure. The mask output by the RSR module is provided to the decoder as attention guidance. Finally, we adaptively combine the masks from the RSR module and the decoder to form the final mask. Experimental results on the image harmonization dataset demonstrate that our method achieves competitive performance both quantitatively and qualitatively.
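
To make the two-cluster idea concrete, the following is a minimal sketch of plain 2-means clustering on raw pixel colors, followed by a fixed-weight blend of two masks. It is not the RSR module: the abstract gives no architectural details, and the actual method works on learned bottleneck features with a learned, adaptive combination. The function names `two_cluster_mask` and `combine_masks`, the seeding heuristic (the pixel farthest from the mean color seeds the inharmonious cluster), and the scalar `weight` are all assumptions made for illustration.

```python
from typing import List, Sequence

Pixel = Sequence[float]


def _sq_dist(a: Pixel, b: Pixel) -> float:
    """Squared Euclidean distance between two color vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def _mean(points: List[Pixel]) -> List[float]:
    """Channel-wise mean of a non-empty list of color vectors."""
    n = len(points)
    return [sum(channel) / n for channel in zip(*points)]


def two_cluster_mask(pixels: List[Pixel], num_iters: int = 10) -> List[int]:
    """Hypothetical helper: 1 marks the (assumed) inharmonious cluster, 0 the background."""
    # Assumption: the background centroid starts at the mean color, and the
    # inharmonious centroid starts at the pixel farthest from that mean.
    bg = _mean(pixels)
    fg = list(max(pixels, key=lambda p: _sq_dist(p, bg)))
    mask = [0] * len(pixels)
    for _ in range(num_iters):
        # Assignment step: each pixel joins its nearest centroid.
        new_mask = [1 if _sq_dist(p, fg) < _sq_dist(p, bg) else 0 for p in pixels]
        fg_pts = [p for p, m in zip(pixels, new_mask) if m]
        bg_pts = [p for p, m in zip(pixels, new_mask) if not m]
        converged = new_mask == mask
        mask = new_mask
        if converged or not fg_pts or not bg_pts:
            break
        # Update step: recompute both centroids from their members.
        fg, bg = _mean(fg_pts), _mean(bg_pts)
    return mask


def combine_masks(mask_a: Sequence[float], mask_b: Sequence[float], weight: float) -> List[float]:
    """Blend two masks; the fixed `weight` stands in for a learned, adaptive one."""
    return [weight * a + (1.0 - weight) * b for a, b in zip(mask_a, mask_b)]
```

For example, calling `two_cluster_mask` on six gray pixels `(0.2, 0.2, 0.2)` followed by two brighter pixels `(0.9, 0.9, 0.9)` flags only the last two. Hard assignments are used here because they keep the sketch short and deterministic; a learned module would more likely produce soft masks.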