This paper presents an end-to-end learning-based video compression system, termed CANF-VC, based on conditional augmented normalizing flows (ANF). Most learned video compression systems adopt the same hybrid-based coding architecture as traditional codecs. Recent research on conditional coding has shown the sub-optimality of hybrid-based coding and opened up opportunities for deep generative models to play a key role in creating new coding frameworks. CANF-VC represents a new attempt that leverages the conditional ANF to learn a video generative model for conditional inter-frame coding. We choose ANF because it is a special type of generative model that includes the variational autoencoder as a special case and is able to achieve better expressiveness. CANF-VC also extends the idea of conditional coding to motion coding, forming a purely conditional coding framework. Extensive experiments on commonly used datasets confirm the superiority of CANF-VC over state-of-the-art methods.