Some forms of novel visual media enable the viewer to explore a 3D scene from arbitrary viewpoints, by interpolating between a discrete set of original views. Compared to 2D imagery, these types of applications require much larger amounts of storage space, which we seek to reduce. Existing approaches for compressing 3D scenes are based on a separation of compression and rendering: each of the original views is compressed using traditional 2D image formats; the receiver decompresses the views and then performs the rendering. We unify these steps by directly compressing an implicit representation of the scene, a function that maps spatial coordinates to a radiance vector field, which can then be queried to render arbitrary viewpoints. The function is implemented as a neural network and jointly trained for reconstruction as well as compressibility, in an end-to-end manner, with the use of an entropy penalty on the parameters. Our method significantly outperforms a state-of-the-art conventional approach for scene compression, achieving simultaneously higher quality reconstructions and lower bitrates. Furthermore, we show that the performance at lower bitrates can be improved by jointly representing multiple scenes using a soft form of parameter sharing.
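The core idea — training a scene representation jointly for reconstruction quality and parameter compressibility — can be illustrated with a minimal rate-distortion objective. The sketch below is an illustrative assumption, not the paper's implementation: it estimates the bit cost of quantized network weights with an empirical histogram (standing in for a learned entropy model) and adds it, scaled by a trade-off weight `lam`, to the reconstruction error. The function names and the quantization step are hypothetical.

```python
import numpy as np

def weight_entropy_bits(weights, step=0.02):
    """Estimate the bit cost of quantized weights via an empirical
    histogram (a simple stand-in for a learned prior over parameters)."""
    q = np.round(np.asarray(weights) / step).astype(int)
    _, counts = np.unique(q, return_counts=True)
    p = counts / counts.sum()
    # Shannon entropy in bits per weight, times the number of weights.
    return float(-(p * np.log2(p)).sum() * q.size)

def penalized_loss(pred, target, weights, lam=1e-4):
    """Rate-distortion objective: reconstruction MSE (distortion)
    plus an entropy penalty on the quantized parameters (rate)."""
    mse = float(np.mean((np.asarray(pred) - np.asarray(target)) ** 2))
    bits = weight_entropy_bits(weights)
    return mse + lam * bits, mse, bits
```

In an end-to-end setup, both terms would be differentiable (e.g. via a continuous relaxation of quantization) so that minimizing the combined loss trades reconstruction quality against the bitrate of the stored network.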