Learned video compression methods have attracted considerable interest in the video coding community, since they have matched or even exceeded the rate-distortion (RD) performance of traditional video codecs. However, many current learning-based methods exploit only short-range temporal information, which limits their performance. In this paper, we focus on exploiting the unique characteristics of video content and further exploring temporal information to enhance compression performance. Specifically, to exploit long-range temporal information, we propose a temporal prior that is updated continuously within the group of pictures (GOP) during inference. The temporal prior thus contains valuable temporal information from all decoded images within the current GOP. For short-range temporal information, we propose progressive guided motion compensation to achieve robust and effective compensation. In detail, we design a hierarchical structure to achieve multi-scale compensation. More importantly, we use optical flow guidance to generate pixel offsets between feature maps at each scale, and the compensation result at each scale is used to guide the compensation at the next scale. Extensive experimental results demonstrate that our method achieves better RD performance than state-of-the-art video compression approaches. The code is publicly available at https://github.com/Huairui/LSTVC.
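As a rough illustration of a temporal prior that is updated within a GOP, the following minimal sketch resets the prior at each GOP boundary and refreshes it after every decoded frame. It is not the released implementation: the names `update_prior` and `run_gops`, the running-average update rule, and the `momentum` parameter are assumptions made for illustration only, and the encode/decode round trip is replaced by an identity stand-in.

```python
import numpy as np


def update_prior(prior, decoded, momentum=0.5):
    """Hypothetical update rule: blend the old prior with the newest decoded frame.

    A learned network would take this role in a real codec; the running
    average here is only a placeholder.
    """
    if prior is None:  # first frame of a GOP: nothing accumulated yet
        return decoded.copy()
    return momentum * prior + (1.0 - momentum) * decoded


def run_gops(frames, gop_size):
    """Return the prior that is available when each frame is coded.

    The entry is None for the first frame of every GOP, because the prior
    is reset at GOP boundaries.
    """
    priors_seen = []
    prior = None
    for idx, frame in enumerate(frames):
        if idx % gop_size == 0:
            prior = None  # reset at the start of each GOP
        priors_seen.append(None if prior is None else prior.copy())
        decoded = frame  # stand-in for the real encode/decode round trip
        prior = update_prior(prior, decoded)
    return priors_seen


if __name__ == "__main__":
    frames = [np.full((2, 2), float(i)) for i in range(4)]
    for idx, p in enumerate(run_gops(frames, gop_size=4)):
        print(idx, None if p is None else float(p.mean()))
```

In this sketch, the prior seen by a frame depends on every earlier decoded frame in the same GOP, never on frames from a previous GOP.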