This paper proposes a Perceptual Learned Video Compression (PLVC) approach with a recurrent conditional GAN. We employ a recurrent auto-encoder-based compression network as the generator and, most importantly, we propose a recurrent conditional discriminator, which distinguishes raw from compressed video conditioned on both spatial and temporal features, including the latent representation, the temporal motion, and the hidden states of the recurrent cells. In this way, adversarial training pushes the generated video to be not only spatially photo-realistic but also temporally consistent with the ground truth and coherent across video frames. The experimental results show that the learned PLVC model compresses video with good perceptual quality at low bit-rates, and that it outperforms the official HEVC test model (HM 16.20) and existing learned video compression approaches on several perceptual quality metrics and in user studies. The code will be released at the project page: https://github.com/RenYang-home/PLVC.
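To make the idea of a recurrent conditional discriminator concrete, the following is a minimal toy sketch and not the PLVC implementation. It assumes each frame, latent representation, and motion field has already been flattened into a short feature vector, and it uses a single tanh recurrent layer with random weights. The class name `ToyRecurrentConditionalDiscriminator`, its parameters, and the choice to let the discriminator carry its own hidden state are all hypothetical, chosen only for illustration.

```python
import math
import random


def dot(weights, values):
    """Inner product of two equal-length lists."""
    return sum(w * v for w, v in zip(weights, values))


class ToyRecurrentConditionalDiscriminator:
    """Toy stand-in (hypothetical): scores each frame given its conditions
    (latent, motion) and a hidden state carried over from earlier frames."""

    def __init__(self, frame_dim, latent_dim, motion_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        in_dim = frame_dim + latent_dim + motion_dim + hidden_dim
        self.hidden_dim = hidden_dim
        # Random, untrained weights: this sketch only shows the data flow.
        self.w_hidden = [
            [rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
            for _ in range(hidden_dim)
        ]
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]

    def step(self, frame, latent, motion, hidden):
        """Score one frame; return (probability of 'raw', new hidden state)."""
        # Conditioning: concatenate the frame with its spatial and temporal cues.
        x = list(frame) + list(latent) + list(motion) + list(hidden)
        new_hidden = [math.tanh(dot(row, x)) for row in self.w_hidden]
        logit = dot(self.w_out, new_hidden)
        prob_raw = 1.0 / (1.0 + math.exp(-logit))
        return prob_raw, new_hidden

    def score_video(self, frames, latents, motions):
        """Run the discriminator over a sequence, threading the hidden state."""
        hidden = [0.0] * self.hidden_dim
        scores = []
        for frame, latent, motion in zip(frames, latents, motions):
            prob_raw, hidden = self.step(frame, latent, motion, hidden)
            scores.append(prob_raw)
        return scores
```

Because the hidden state is threaded from frame to frame, the score for a given frame depends on the frames before it, which is what lets a discriminator of this kind penalize temporal inconsistency rather than only per-frame artifacts.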