According to the Cisco Annual Internet Report (2018-2023), the fastest-growing category of Internet traffic is machine-to-machine communication. In particular, machine-to-machine communication of images and videos poses a new challenge and opens up new perspectives for data compression. One possible approach is to adapt current human-targeted image and video coding standards to the use case of machine consumption. Another is to develop completely new compression paradigms and architectures for machine-to-machine communication. In this paper, we focus on image compression and present an inference-time, content-adaptive finetuning scheme that optimizes the latent representation of an end-to-end learned image codec, with the aim of improving compression efficiency for machine consumption. Our experiments show that this online finetuning yields an average bitrate saving (BD-rate) of -3.66% with respect to our pretrained image codec. At low bitrate points in particular, the proposed method yields a significant bitrate saving of -9.85%. Overall, our pretrained-and-then-finetuned system achieves a BD-rate of -30.54% over the state-of-the-art image/video codec Versatile Video Coding (VVC).
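The general idea of inference-time latent finetuning can be illustrated with a toy sketch. This is not the codec or the loss used in the paper, neither of which is specified in the abstract. It assumes a frozen linear "decoder" in place of a learned decoder and task network, an L1 norm as a stand-in for the rate estimated by a learned entropy model, and a mean squared error as a stand-in for the task loss. All names (`finetune_latent`, `objective`, `lam`) are hypothetical. Only the latent `y` is updated; the decoder weights stay fixed, which is what makes the adaptation per-image and decoder-compatible.

```python
import numpy as np

def objective(y, decoder, target, lam):
    """Toy rate-distortion objective: MSE distortion + lam * L1 rate proxy."""
    distortion = np.mean((decoder @ y - target) ** 2)
    rate_proxy = np.sum(np.abs(y))  # assumption: stands in for a learned entropy model
    return distortion + lam * rate_proxy

def finetune_latent(y0, decoder, target, lam=0.05, lr=0.05, steps=200):
    """Refine the latent by (sub)gradient descent; the decoder is never updated."""
    y = y0.copy()
    best_y, best_obj = y0.copy(), objective(y0, decoder, target, lam)
    for _ in range(steps):
        residual = decoder @ y - target
        grad_dist = 2.0 * decoder.T @ residual / target.size  # gradient of the MSE term
        grad_rate = lam * np.sign(y)                          # subgradient of the L1 term
        y = y - lr * (grad_dist + grad_rate)
        obj = objective(y, decoder, target, lam)
        if obj < best_obj:  # keep the best iterate seen so far
            best_y, best_obj = y.copy(), obj
    return best_y

# Toy setup: a random frozen decoder and a random initial latent that
# plays the role of the pretrained encoder's output for one image.
rng = np.random.default_rng(0)
decoder = rng.normal(size=(16, 8))
target = decoder @ rng.normal(size=8)
y_init = rng.normal(size=8)
lam = 0.05

y_tuned = finetune_latent(y_init, decoder, target, lam=lam)
before = objective(y_init, decoder, target, lam)
after = objective(y_tuned, decoder, target, lam)
```

In a real learned codec the refined latent would still have to be quantized and entropy-coded, and the gradients would come from automatic differentiation through the frozen decoder and task network rather than from a closed-form expression.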