We propose pix2pix3D, a 3D-aware conditional generative model for controllable photorealistic image synthesis. Given a 2D label map, such as a segmentation or edge map, our model learns to synthesize a corresponding image from different viewpoints. To enable explicit 3D user control, we extend conditional generative models with neural radiance fields. Given widely available pairs of monocular images and label maps, our model learns to assign a label to every 3D point in addition to color and density, which enables it to render the image and a pixel-aligned label map simultaneously. Finally, we build an interactive system that allows users to edit the label map from any viewpoint and generate outputs accordingly.
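The key mechanism is that the same compositing weights used to volume-render color can also composite per-point label logits, yielding a label map that is pixel-aligned with the rendered image by construction. A minimal numpy sketch of this idea (function name and tensor shapes are illustrative assumptions, not the paper's actual code):

```python
import numpy as np

def render_rgb_and_label(sigmas, colors, labels, deltas):
    """Composite density, color, and label logits along one ray.

    sigmas: (N,) per-sample densities
    colors: (N, 3) per-sample RGB values
    labels: (N, K) per-sample label logits for K semantic classes
    deltas: (N,) distances between adjacent samples along the ray

    Returns the rendered RGB pixel and its composited label logits,
    which share the same weights and are therefore pixel-aligned.
    """
    alphas = 1.0 - np.exp(-sigmas * deltas)            # per-sample opacity
    # transmittance: probability the ray reaches each sample unoccluded
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = alphas * trans                           # compositing weights
    rgb = (weights[:, None] * colors).sum(axis=0)      # rendered color
    label = (weights[:, None] * labels).sum(axis=0)    # rendered label logits
    return rgb, label

# Example: a single fully opaque sample dominates the ray,
# so the output matches that sample's color and label.
rgb, label = render_rgb_and_label(
    sigmas=np.array([1e9]),
    colors=np.array([[0.2, 0.4, 0.6]]),
    labels=np.array([[0.0, 1.0]]),
    deltas=np.array([1.0]),
)
```

Because the image and label map are produced by one rendering pass, any edit to the 3D label field stays consistent with the rendered image across viewpoints.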