Image blending aims to combine multiple images seamlessly. It remains challenging for existing 2D-based methods, especially when input images are misaligned due to differences in 3D camera poses and object shapes. To tackle these issues, we propose a 3D-aware blending method using generative Neural Radiance Fields (NeRF), including two key components: 3D-aware alignment and 3D-aware blending. For 3D-aware alignment, we first estimate the camera pose of the reference image with respect to generative NeRFs and then perform 3D local alignment for each part. To further leverage 3D information of the generative NeRF, we propose 3D-aware blending that directly blends images on the NeRF's latent representation space, rather than raw pixel space. Collectively, our method outperforms existing 2D baselines, as validated by extensive quantitative and qualitative evaluations with FFHQ and AFHQ-Cat.
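To make the two components concrete, below is a minimal sketch, not the authors' implementation, of the ideas the abstract names: (1) 3D-aware alignment by inverting a reference image into a generative NeRF to recover a latent code and camera pose, and (2) 3D-aware blending performed on the latent representation rather than on raw pixels. The `GenerativeNeRF` class, its `render(latent, pose)` interface, the joint latent/pose optimization, and the simple linear interpolation (which stands in for the paper's per-part local alignment and blending) are all illustrative assumptions.

```python
import torch

class GenerativeNeRF(torch.nn.Module):
    """Stand-in for a pretrained generative NeRF; a toy decoder replaces volume rendering."""
    def __init__(self, latent_dim=512):
        super().__init__()
        self.latent_dim = latent_dim
        self.backbone = torch.nn.Linear(latent_dim + 3, 3 * 64 * 64)  # hypothetical decoder

    def render(self, latent, pose):
        """Render an RGB image from a latent code and a 3-DoF camera pose (assumed interface)."""
        x = torch.cat([latent, pose], dim=-1)
        return self.backbone(x).view(-1, 3, 64, 64)

def invert(G, target, steps=200, lr=1e-2):
    """3D-aware alignment sketch: jointly optimize a latent code and camera pose
    so that the generative NeRF reconstructs the target image."""
    latent = torch.zeros(1, G.latent_dim, requires_grad=True)
    pose = torch.zeros(1, 3, requires_grad=True)
    opt = torch.optim.Adam([latent, pose], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(G.render(latent, pose), target)
        loss.backward()
        opt.step()
    return latent.detach(), pose.detach()

def blend_in_latent_space(G, latent_ref, latent_orig, pose, alpha=0.5):
    """3D-aware blending sketch: mix codes in latent space, then render once,
    so the generative NeRF enforces 3D consistency instead of 2D pixel compositing."""
    latent_blend = alpha * latent_ref + (1 - alpha) * latent_orig
    return G.render(latent_blend, pose)

if __name__ == "__main__":
    G = GenerativeNeRF()
    ref_img, orig_img = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
    z_ref, pose_ref = invert(G, ref_img)   # align reference via NeRF inversion
    z_orig, _ = invert(G, orig_img)
    blended = blend_in_latent_space(G, z_ref, z_orig, pose_ref)
    print(blended.shape)  # torch.Size([1, 3, 64, 64])
```

In the actual method, the interpolation would be restricted to the latent dimensions governing the edited part (3D local alignment per part) rather than the whole code; the sketch only conveys why latent-space blending differs from pixel-space compositing.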