Learned video compression has recently emerged as an essential research topic in the development of advanced video compression technologies, and motion compensation is considered one of its most challenging problems. In this paper, we propose a learned video compression framework based on a heterogeneous deformable compensation strategy (HDCVC) to address the unstable compression performance caused by single-size deformable kernels in the downsampled feature domain. More specifically, instead of using optical flow warping or single-size-kernel deformable alignment, the proposed algorithm extracts features from two adjacent frames to estimate content-adaptive heterogeneous deformable (HetDeform) kernel offsets. We then transform the reference features with the HetDeform convolution to accomplish motion compensation. Moreover, we design a Spatial-Neighborhood-Conditioned Divisive Normalization (SNCDN), used in combination with Generalized Divisive Normalization, to achieve more effective data Gaussianization. Furthermore, we propose a multi-frame enhanced reconstruction module that exploits context and temporal information for final quality enhancement. Experimental results indicate that HDCVC outperforms recent state-of-the-art learned video compression approaches.
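For readers unfamiliar with the Generalized Divisive Normalization mentioned above, the following minimal sketch shows the standard GDN transform applied across channels at a single spatial position. It is not the proposed SNCDN, whose formulation is not given here; the function name `gdn` and the parameter values are hypothetical and chosen only for illustration.

```python
import math


def gdn(x, beta, gamma):
    """Standard GDN across channels at one spatial position.

    y_i = x_i / sqrt(beta_i + sum_j gamma[i][j] * x_j ** 2)

    x:     list of channel responses (illustrative input)
    beta:  per-channel positive offsets (learned in practice)
    gamma: non-negative channel-coupling weights (learned in practice)
    """
    n = len(x)
    out = []
    for i in range(n):
        # Each channel is divided by a pooled energy of all channels.
        denom = beta[i] + sum(gamma[i][j] * x[j] ** 2 for j in range(n))
        out.append(x[i] / math.sqrt(denom))
    return out


# Hypothetical two-channel example: with beta = 0 and gamma = 1 everywhere,
# every channel is divided by the L2 norm of the channel vector.
x = [3.0, 4.0]
y = gdn(x, beta=[0.0, 0.0], gamma=[[1.0, 1.0], [1.0, 1.0]])
```

In a learned codec, `beta` and `gamma` would be trained jointly with the rest of the network; the fixed values here only make the normalization easy to check by hand.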