We present Panoptic Neural Fields (PNF), an object-aware neural scene representation that decomposes a scene into a set of objects (things) and background (stuff). Each object is represented by an oriented 3D bounding box and a multi-layer perceptron (MLP) that takes position, direction, and time and outputs density and radiance. The background stuff is represented by a similar MLP that additionally outputs semantic labels. Each object MLP is instance-specific and thus can be smaller and faster than previous object-aware approaches, while still leveraging category-specific priors incorporated via meta-learned initialization. Our model builds a panoptic radiance field representation of any scene from just color images. We use off-the-shelf algorithms to predict camera poses, object tracks, and 2D image semantic segmentations. Then we jointly optimize the MLP weights and bounding box parameters using analysis-by-synthesis with self-supervision from color images and pseudo-supervision from predicted semantic segmentations. In experiments on real-world dynamic scenes, we find that our model can be used effectively for several tasks, including novel view synthesis, 2D panoptic segmentation, 3D scene editing, and multiview depth prediction.
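To make the decomposition concrete, below is a minimal sketch of the two MLP types described above, written in PyTorch. The class names, layer widths, and activations are our own assumptions for illustration, not the authors' implementation; the essential structure is that each thing gets a small instance-specific MLP mapping (position, direction, time) to (density, radiance), while the stuff MLP adds a semantic-logits head.

```python
# Illustrative sketch of the PNF thing/stuff MLPs (assumed architecture,
# not the authors' code). Inputs: 3D position x, view direction d, time t.
import torch
import torch.nn as nn


class ObjectMLP(nn.Module):
    """Small instance-specific MLP: (position, direction, time) -> (density, radiance)."""

    def __init__(self, hidden: int = 64):
        super().__init__()
        # Input: 3 (position) + 3 (direction) + 1 (time) = 7 dims.
        self.net = nn.Sequential(
            nn.Linear(7, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # 1 density + 3 RGB radiance
        )

    def forward(self, x, d, t):
        out = self.net(torch.cat([x, d, t], dim=-1))
        density = torch.relu(out[..., :1])       # non-negative density
        radiance = torch.sigmoid(out[..., 1:])   # RGB in [0, 1]
        return density, radiance


class BackgroundMLP(nn.Module):
    """Stuff MLP: like ObjectMLP, but with an extra semantic-logits head."""

    def __init__(self, num_classes: int, hidden: int = 128):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(7, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.density_color = nn.Linear(hidden, 4)
        # Semantic head, pseudo-supervised by predicted 2D segmentations.
        self.semantics = nn.Linear(hidden, num_classes)

    def forward(self, x, d, t):
        h = self.trunk(torch.cat([x, d, t], dim=-1))
        out = self.density_color(h)
        density = torch.relu(out[..., :1])
        radiance = torch.sigmoid(out[..., 1:])
        logits = self.semantics(h)
        return density, radiance, logits
```

In a full pipeline, samples falling inside an object's oriented bounding box would be transformed into that box's canonical frame before querying its MLP, and the per-object and background densities and radiances would be composited along each ray by standard volume rendering, with the MLP weights and box parameters optimized jointly as described above.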