Learning End-to-End Lossy Image Compression: A Benchmark

From arXiv. Accepted for publication in IEEE Transactions on Pattern Analysis and Machine Intelligence. Website available at https://huzi96.github.io/compression-bench.html

Image compression is one of the most fundamental techniques and commonly used applications in the image and video processing field. Earlier methods built a well-designed pipeline, and efforts were made to improve all modules of the pipeline by handcrafted tuning. Later, tremendous contributions were made, especially when data-driven methods revitalized the domain with their excellent modeling capacities and flexibility in incorporating newly designed modules and constraints. Despite great progress, a systematic benchmark and comprehensive analysis of end-to-end learned image compression methods are lacking. In this paper, we first conduct a comprehensive literature survey of learned image compression methods. The literature is organized based on several aspects to jointly optimize the rate-distortion performance with a neural network, i.e., network architecture, entropy model and rate control. We describe milestones in cutting-edge learned image-compression methods, review a broad range of existing works, and provide insights into their historical development routes. With this survey, the main challenges of image compression methods are revealed, along with opportunities to address the related issues with recent advanced learning methods. This analysis provides an opportunity to take a further step towards higher-efficiency image compression. By introducing a coarse-to-fine hyperprior model for entropy estimation and signal reconstruction, we achieve improved rate-distortion performance, especially on high-resolution images. Extensive benchmark experiments demonstrate the superiority of our model in rate-distortion performance and time complexity on multi-core CPUs and GPUs. Our project website is available at https://huzi96.github.io/compression-bench.html.
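The abstract refers to jointly optimizing rate-distortion performance but does not state the objective itself. As a rough illustration only, the sketch below uses the formulation commonly seen in learned compression, a loss of the form R + λ·D, where R is the estimated bitrate and D is the reconstruction error. The function names, the use of mean squared error as the distortion, and the toy values are all assumptions made for this example and are not taken from the paper.

```python
import math


def bits_per_pixel(likelihoods, num_pixels):
    """Estimated rate: sum of -log2(p) over the coded symbols, per image pixel."""
    total_bits = -sum(math.log2(p) for p in likelihoods)
    return total_bits / num_pixels


def mean_squared_error(original, reconstruction):
    """Distortion: average squared difference between the two signals."""
    squared = [(a - b) ** 2 for a, b in zip(original, reconstruction)]
    return sum(squared) / len(original)


def rate_distortion_loss(likelihoods, original, reconstruction, lam):
    """Hypothetical joint objective R + lam * D (an assumed, common form)."""
    rate = bits_per_pixel(likelihoods, len(original))
    distortion = mean_squared_error(original, reconstruction)
    return rate + lam * distortion


# Toy example: a 4-pixel "image" and two coded symbols whose
# probabilities come from an (assumed) entropy model.
original = [0.0, 0.5, 1.0, 0.5]
reconstruction = [0.0, 0.5, 0.5, 0.5]
likelihoods = [0.5, 0.25]

loss = rate_distortion_loss(likelihoods, original, reconstruction, lam=8.0)
print(loss)
```

In this sketch a larger `lam` weights distortion more heavily and so favors higher quality at a higher bitrate; choosing it is one simple form of rate control. The entropy model's role is to supply the `likelihoods`: the better its probability estimates match the data, the lower the estimated rate.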
