GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields

Deep generative models allow for photorealistic image synthesis at high resolutions. But for many applications, this is not enough: content creation also needs to be controllable. While several recent works investigate how to disentangle underlying factors of variation in the data, most of them operate in 2D and hence ignore that our world is three-dimensional. Further, only few works consider the compositional nature of scenes. Our key hypothesis is that incorporating a compositional 3D scene representation into the generative model leads to more controllable image synthesis. Representing scenes as compositional generative neural feature fields allows us to disentangle one or multiple objects from the background as well as individual objects' shapes and appearances while learning from unstructured and unposed image collections without any additional supervision. Combining this scene representation with a neural rendering pipeline yields a fast and realistic image synthesis model. As evidenced by our experiments, our model is able to disentangle individual objects and allows for translating and rotating them in the scene as well as changing the camera pose.
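To make the idea of a compositional scene representation concrete, here is a minimal sketch of how per-entity predictions at a single 3D point might be merged into one scene-level value. The abstract does not specify the composition operator, so the density-weighted average used here is an assumption for illustration, and the function name `compose_fields` and its arguments are hypothetical.

```python
from typing import List, Sequence, Tuple


def compose_fields(
    densities: Sequence[float],
    features: Sequence[Sequence[float]],
) -> Tuple[float, List[float]]:
    """Merge per-entity (density, feature) predictions at one 3D point.

    Assumption (not stated in the abstract): densities are summed and
    features are averaged with the densities as weights, so an entity
    that is more "solid" at this point contributes more to the feature.
    """
    if not features or len(densities) != len(features):
        raise ValueError("need one feature vector per density, at least one entity")

    dim = len(features[0])
    total = sum(densities)

    # Empty space: no entity occupies this point, so return a zero feature.
    if total == 0.0:
        return 0.0, [0.0] * dim

    combined = [
        sum(d * f[k] for d, f in zip(densities, features)) / total
        for k in range(dim)
    ]
    return total, combined


# Two entities (e.g. one object and the background) queried at the same point.
sigma, feat = compose_fields([1.0, 3.0], [[1.0, 0.0], [0.0, 1.0]])
```

Because each entity keeps its own field, moving or rotating one object only changes where that object's field is queried; the other entities and the merge step stay the same.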

