This paper proposes a Perceptual Learned Video Compression (PLVC) approach based on a recurrent conditional generative adversarial network (GAN). In our approach, a recurrent auto-encoder-based generator learns to fully exploit temporal correlation when compressing video. More importantly, we propose a recurrent conditional discriminator, which judges raw and compressed video conditioned on both spatial and temporal information, including the latent representation, temporal motion, and the hidden states of the recurrent cells. In this way, adversarial training pushes the generated video to be not only spatially photo-realistic but also temporally consistent with the ground truth and coherent across video frames. The experimental results show that the proposed PLVC model learns to compress video with good perceptual quality at low bit-rates, and that it outperforms previous traditional and learned approaches on several perceptual quality metrics. A user study further validates the outstanding perceptual performance of PLVC in comparison with the latest learned video compression approaches and the official HEVC test model (HM 16.20). The code will be released at https://github.com/RenYang-home/PLVC.