Most existing image inpainting algorithms operate on a single view and struggle with large holes or holes that contain complicated scenes. Some reference-guided algorithms fill the hole by referring to an image taken from another viewpoint and aligning it with a 2D image transformation. Because of the camera imaging process, a simple 2D transformation rarely achieves a satisfactory result. In this paper, we propose 3DFill, a simple and efficient method for reference-guided image inpainting. Given a target image with arbitrary hole regions and a reference image from another viewpoint, 3DFill first aligns the two images with a two-stage method, 3D projection + 2D transformation, which gives better results than 2D image alignment alone. The 3D projection is a global alignment between the images, and the 2D transformation is a local alignment focused on the hole region. The entire image alignment process is self-supervised. We then fill the hole in the target image with the contents of the aligned image. Finally, we use a conditional generation network to refine the filled image and obtain the inpainting result. 3DFill achieves state-of-the-art performance on image inpainting across a variety of wide view shifts and has a faster inference speed than other inpainting models.
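The two geometric steps named above, 3D projection and hole filling, can be illustrated with a minimal NumPy sketch. It is not the 3DFill implementation: the abstract does not say how the projection is parameterized, so the pinhole intrinsics `K`, the relative pose `R`, `t`, the per-pixel `depth`, and the function names `project_to_reference` and `fill_hole` are all assumptions made for illustration. The local 2D transformation and the refinement network are omitted.

```python
import numpy as np

def project_to_reference(u, v, depth, K, R, t):
    """Map a target pixel (u, v) with known depth to reference-image coordinates.

    Assumes a standard pinhole camera with shared intrinsics K and a
    relative pose (R, t) from the target to the reference camera.
    """
    pixel = np.array([u, v, 1.0])
    # Back-project the pixel to a 3D point in the target camera frame.
    point_target = depth * (np.linalg.inv(K) @ pixel)
    # Move the point into the reference camera frame.
    point_ref = R @ point_target + t
    # Project onto the reference image plane and dehomogenize.
    proj = K @ point_ref
    return proj[:2] / proj[2]

def fill_hole(target, aligned, mask):
    """Copy aligned reference pixels into the hole (mask == 1); keep target elsewhere."""
    m = mask[..., None].astype(target.dtype)
    return m * aligned + (1 - m) * target

# Hypothetical camera: focal length 100 px, principal point (32, 24).
K = np.array([[100.0, 0.0, 32.0],
              [0.0, 100.0, 24.0],
              [0.0, 0.0, 1.0]])
# With an identity pose the pixel maps back onto itself.
uv = project_to_reference(10.0, 20.0, 2.0, K, np.eye(3), np.zeros(3))

target = np.zeros((2, 2, 3))
aligned = np.ones((2, 2, 3))
mask = np.array([[1, 0], [0, 0]])
filled = fill_hole(target, aligned, mask)
```

Under these assumptions, the filled image takes the aligned reference content only inside the hole, which is the input a refinement network would then clean up.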