We introduce a video compression algorithm based on instance-adaptive learning: for each video sequence to be transmitted, we finetune a pretrained compression model, and the optimal parameters are transmitted to the receiver along with the latent code. By entropy-coding the parameter updates under a suitable mixture-model prior, we ensure that the network parameters can be encoded efficiently. This instance-adaptive compression algorithm is agnostic to the choice of base model and has the potential to improve any neural video codec. On the UVG, HEVC, and Xiph datasets, our codec improves the performance of a low-latency scale-space flow model by 21% to 26% in BD-rate savings, and that of a state-of-the-art B-frame model by 17% to 20%. We also demonstrate that instance-adaptive finetuning improves robustness to domain shift. Finally, our approach reduces the capacity requirements of compression models: we show that it enables state-of-the-art performance even after reducing the network size by 72%.