Recent years have witnessed increasing interest in end-to-end learned video compression. Most previous works exploit temporal redundancy by estimating and compressing a motion map to warp the reference frame towards the target frame. Yet, they fail to adequately take advantage of the historical priors in the sequential reference frames. In this paper, we propose an Advanced Learned Video Compression (ALVC) approach with an in-loop frame prediction module, which effectively predicts the target frame from previously compressed frames without consuming any bit-rate. The predicted frame can serve as a better reference than the previously compressed frame, and therefore benefits the compression performance. The proposed in-loop prediction module is part of the end-to-end video compression framework and is jointly optimized with the whole framework. We propose recurrent and bi-directional in-loop prediction modules for compressing P-frames and B-frames, respectively. The experiments show the state-of-the-art performance of our ALVC approach in learned video compression. We also outperform the default hierarchical-B mode of x265 in terms of PSNR, and beat the slowest mode of the SSIM-tuned x265 in terms of MS-SSIM. Project page: https://github.com/RenYang-home/ALVC.