Neural fields have emerged as a powerful paradigm for representing various signals, including videos. However, research on improving the parameter efficiency of neural fields is still in its early stages. Although neural fields that map coordinates to colors can encode video signals, this scheme does not exploit the spatial and temporal redundancy of videos. Inspired by standard video compression algorithms, we propose a neural field architecture for representing and compressing videos that deliberately removes data redundancy by exploiting motion information across video frames. Because motion information is typically smoother and less complex than color signals, it can be maintained with far fewer parameters. Furthermore, reusing color values through motion information further improves the parameter efficiency of the network. In addition, we suggest using more than one reference frame for video frame reconstruction and employing separate networks, one for optical flows and the other for residuals. Experimental results show that the proposed method outperforms the baseline methods by a significant margin. The code is available at https://github.com/daniel03c1/eff_video_representation
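To make the flow-plus-residual idea concrete, the sketch below illustrates (under assumptions, not the authors' implementation) how two small coordinate-based networks could reconstruct a frame: one predicts per-pixel optical flow used to warp a reference frame, and the other predicts a residual that is added to the warped result. The network sizes, coordinate ranges, and function names are illustrative, and only a single reference frame is used here for simplicity.

```python
# Minimal PyTorch sketch of flow-based frame reconstruction with a residual
# network, assuming normalized (x, y, t) coordinates in [-1, 1]; all names and
# sizes are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoordMLP(nn.Module):
    """Small coordinate-based MLP: maps (x, y, t) coordinates to `out_dim` values."""
    def __init__(self, in_dim=3, hidden=64, out_dim=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, coords):
        return self.net(coords)

def reconstruct_frame(flow_field, residual_field, reference, t, H, W):
    """Warp one reference frame with the predicted flow, then add the predicted residual.

    reference: (1, 3, H, W) tensor holding a reference frame.
    t: scalar time coordinate of the target frame.
    """
    # Normalized pixel grid in [-1, 1], the convention expected by grid_sample.
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij")
    coords = torch.stack([xs, ys, torch.full_like(xs, t)], dim=-1).reshape(-1, 3)

    # Flow field: per-pixel (dx, dy) offsets; residual field: per-pixel RGB correction.
    flow = flow_field(coords).reshape(1, H, W, 2)
    residual = residual_field(coords).reshape(1, H, W, 3).permute(0, 3, 1, 2)

    base_grid = torch.stack([xs, ys], dim=-1).unsqueeze(0)  # (1, H, W, 2)
    warped = F.grid_sample(reference, base_grid + flow,
                           mode="bilinear", align_corners=True)
    return warped + residual

# Example usage with random data.
flow_net, res_net = CoordMLP(out_dim=2), CoordMLP(out_dim=3)
ref = torch.rand(1, 3, 64, 64)
frame = reconstruct_frame(flow_net, res_net, ref, t=0.5, H=64, W=64)
print(frame.shape)  # torch.Size([1, 3, 64, 64])
```

Since the flow and residual signals are smoother than raw color, the two MLPs in such a setup can be kept small, which is the source of the parameter savings the abstract describes.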