This paper presents CANF-VC, an end-to-end learning-based video compression system built on conditional augmented normalizing flows (CANF). Most learned video compression systems adopt the same hybrid coding architecture as traditional codecs. Recent research on conditional coding has shown the sub-optimality of hybrid coding and opens up opportunities for deep generative models to play a key role in new coding frameworks. CANF-VC represents a new attempt that leverages conditional ANF to learn a video generative model for conditional inter-frame coding. We choose ANF because it is a special type of generative model that includes the variational autoencoder as a special case and achieves better expressiveness. CANF-VC further extends conditional coding to motion coding, forming a purely conditional coding framework. Extensive experiments on commonly used datasets confirm the superiority of CANF-VC over state-of-the-art methods. The source code of CANF-VC is available at https://github.com/NYCU-MAPL/CANF-VC.
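To make the distinction between hybrid (residual) inter-frame coding and conditional inter-frame coding concrete, the following minimal sketch contrasts the two. It is purely illustrative and does not reproduce the CANF-based architecture of CANF-VC (which replaces the plain autoencoders below with conditional augmented normalizing flows); all class and variable names are hypothetical.

```python
# Illustrative contrast between residual (hybrid) coding and conditional coding.
# Not the authors' implementation; simple autoencoders stand in for the codecs.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ResidualCoder(nn.Module):
    """Hybrid-style coding: encode the residual x_t - x_cond explicitly."""

    def __init__(self, ch=64):
        super().__init__()
        self.enc = nn.Conv2d(3, ch, 3, stride=2, padding=1)          # analysis (toy)
        self.dec = nn.ConvTranspose2d(ch, 3, 4, stride=2, padding=1)  # synthesis (toy)

    def forward(self, x_t, x_cond):
        residual = x_t - x_cond          # subtract the temporal prediction
        latent = self.enc(residual)      # stand-in for analysis + entropy coding
        return x_cond + self.dec(latent)  # add the decoded residual back


class ConditionalCoder(nn.Module):
    """Conditional coding: encode x_t given x_cond, with no explicit subtraction.

    The encoder and decoder both observe the condition, letting the network
    decide how to exploit it rather than being restricted to a residual.
    """

    def __init__(self, ch=64):
        super().__init__()
        self.enc = nn.Conv2d(6, ch, 3, stride=2, padding=1)               # sees x_t and x_cond jointly
        self.dec = nn.ConvTranspose2d(ch + 3, 3, 4, stride=2, padding=1)  # decodes given the condition

    def forward(self, x_t, x_cond):
        latent = self.enc(torch.cat([x_t, x_cond], dim=1))
        cond_small = F.interpolate(x_cond, scale_factor=0.5)  # match latent resolution
        return self.dec(torch.cat([latent, cond_small], dim=1))


if __name__ == "__main__":
    x_t = torch.rand(1, 3, 64, 64)     # current frame
    x_cond = torch.rand(1, 3, 64, 64)  # motion-compensated prediction of x_t
    print(ResidualCoder()(x_t, x_cond).shape)     # torch.Size([1, 3, 64, 64])
    print(ConditionalCoder()(x_t, x_cond).shape)  # torch.Size([1, 3, 64, 64])
```

In CANF-VC the same conditional idea is applied to both inter-frame coding and motion coding, so no signal in the pipeline is coded as an explicit residual.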