Hybrid Spatial-Temporal Entropy Modelling for Neural Video Compression

For a neural video codec, it is critical, yet challenging, to design an efficient entropy model that accurately predicts the probability distribution of the quantized latent representation. However, most existing neural video codecs directly reuse an off-the-shelf entropy model from image codecs to encode the residual or motion, and do not fully exploit the spatial-temporal characteristics of video. To this end, this paper proposes a powerful entropy model that efficiently captures both spatial and temporal dependencies. In particular, we introduce the latent prior, which exploits the correlation among latent representations to reduce temporal redundancy. Meanwhile, the dual spatial prior is proposed to reduce spatial redundancy in a parallel-friendly manner. The entropy model is also versatile: besides estimating the probability distribution, it generates the quantization step in a spatial-channel-wise manner. This content-adaptive quantization mechanism not only helps the codec achieve smooth rate adjustment within a single model but also improves the final rate-distortion performance through dynamic bit allocation. Experimental results show that, powered by the proposed entropy model, our neural codec achieves 18.2% bitrate saving on the UVG dataset compared with H.266 (VTM) using the highest-compression-ratio configuration. This marks a new milestone in the development of neural video codecs. The code is available at https://github.com/microsoft/DCVC.
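To make the content-adaptive quantization idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a Gaussian probability model (chosen here for simplicity; the abstract does not name the distribution) and hand-picked values for the mean, scale, and quantization step, which in the paper are produced by the entropy model. All function names (`quantize`, `dequantize`, `estimated_bits`, `code_latent`) are hypothetical.

```python
import math


def gaussian_cdf(x, mean, scale):
    """CDF of a Gaussian with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))


def quantize(y, step):
    """Divide the latent value by its quantization step and round to an integer symbol."""
    return round(y / step)


def dequantize(q, step):
    """Map the integer symbol back to the latent domain."""
    return q * step


def estimated_bits(q, mean, scale, step):
    """Estimated bits for symbol q, assuming y ~ N(mean, scale) so y/step ~ N(mean/step, scale/step)."""
    m, s = mean / step, scale / step
    # Probability mass of the unit-width bin centred on q.
    p = gaussian_cdf(q + 0.5, m, s) - gaussian_cdf(q - 0.5, m, s)
    return -math.log2(max(p, 1e-9))  # clamp to avoid log(0)


def code_latent(y, mean, scale, step):
    """Return (reconstructed value, estimated bits) for one latent element."""
    q = quantize(y, step)
    return dequantize(q, step), estimated_bits(q, mean, scale, step)


if __name__ == "__main__":
    # Hypothetical values for a single latent element.
    y, mean, scale = 3.3, 0.0, 2.0
    for step in (0.5, 2.0):
        y_hat, bits = code_latent(y, mean, scale, step)
        print(f"step={step}: y_hat={y_hat}, error={abs(y - y_hat):.2f}, bits={bits:.2f}")
```

In this toy setting, a larger step spends fewer estimated bits on the element at the cost of a larger reconstruction error. Choosing a separate step per spatial position and channel is what allows rate to be adjusted within one model and bits to be allocated according to content.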

