In video compression, coding efficiency is improved by reusing pixels from previously decoded frames via motion and residual compensation. We define two levels of hierarchical redundancy in video frames: 1) first-order: redundancy in pixel space, i.e., similarities in pixel values across neighboring frames, which is effectively captured using motion and residual compensation; 2) second-order: redundancy in the motion and residual maps themselves, arising from the smooth motion of natural videos. While most of the existing neural video coding literature addresses first-order redundancy, we tackle the problem of capturing second-order redundancy in neural video codecs via predictors. We introduce generic motion and residual predictors that learn to extrapolate from previously decoded data. These predictors are lightweight and can be employed with most neural video codecs to improve their rate-distortion performance. Moreover, while RGB is the dominant colorspace in the neural video coding literature, we introduce general modifications that enable neural video codecs to support the YUV420 colorspace, and report YUV420 results. Our experiments show that using our predictors with a well-known neural video codec yields 38% and 34% bitrate savings in the RGB and YUV420 colorspaces, respectively, measured on the UVG dataset.
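The idea of exploiting second-order redundancy can be illustrated with a minimal sketch. The snippet below uses simple linear extrapolation (constant-velocity assumption) as a stand-in for the learned predictors described above; the function name and shapes are illustrative, not the paper's actual architecture. When motion is smooth, the extrapolated map is close to the true one, so the codec only needs to encode a small prediction error instead of the full map.

```python
import numpy as np

def extrapolate_map(prev2, prev1):
    """Extrapolate the next motion (or residual) map from the two most
    recently decoded maps, assuming locally smooth (linear) change:
        m_t ≈ 2 * m_{t-1} - m_{t-2}
    A learned predictor would replace this hand-crafted rule."""
    return 2.0 * prev1 - prev2

# Toy example with perfectly smooth motion (constant velocity).
m0 = np.zeros((2, 4, 4))       # motion map at t-2 (channels: dx, dy)
m1 = m0 + 0.5                  # map at t-1: one constant-velocity step
m2_true = m0 + 1.0             # actual map at t
m2_pred = extrapolate_map(m0, m1)

# The residual the codec must transmit shrinks to the prediction error.
err = np.abs(m2_true - m2_pred).max()
```

Under the constant-velocity assumption the prediction is exact and `err` is zero; for real videos the predictor only reduces, rather than eliminates, the bits spent on motion and residual maps.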