Diffusion models are a recent class of generative models that mark a milestone in high-quality image generation while resting on solid probabilistic principles, which makes them promising candidates for neural image compression. This paper outlines an end-to-end optimized framework for image compression based on a conditional diffusion model. Besides the latent variables inherent to the diffusion process, the model introduces an additional per-instance "content" latent variable that conditions the denoising process. At decoding time, the diffusion process conditionally reconstructs the image using ancestral sampling. Our experiments show that this approach outperforms one of the best-performing conventional image codecs (BPG) and one neural codec on two compression benchmarks, where we focus on rate-perception tradeoffs. Qualitatively, our approach shows fewer decompression artifacts than the classical approach.
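The decoding step described above, ancestral sampling conditioned on a content latent, can be illustrated with a minimal sketch. It uses the standard DDPM noise-prediction update with a linear noise schedule, which is an assumption and not necessarily the parameterization used in this paper. The names `make_schedule`, `toy_denoiser`, and `ancestral_sample` are hypothetical. The denoiser is a stand-in for a trained network: it treats the content latent `z` as if it were the clean image itself, whereas a real codec would decode `z` from a quantized, entropy-coded bitstream.

```python
import numpy as np


def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumed choice, common in DDPM-style models)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)  # cumulative signal retention up to step t
    return betas, alphas, alpha_bars


def toy_denoiser(x_t, alpha_bar_t, z):
    """Hypothetical stand-in for a trained noise predictor eps(x_t, t, z).

    Assumes z has the same shape as the image and equals the clean image,
    so the noise can be solved for in closed form. A trained network would
    replace this function.
    """
    return (x_t - np.sqrt(alpha_bar_t) * z) / np.sqrt(1.0 - alpha_bar_t)


def ancestral_sample(z, num_steps=50, rng=None):
    """Run the reverse diffusion chain from pure noise, conditioned on z."""
    rng = np.random.default_rng() if rng is None else rng
    betas, alphas, alpha_bars = make_schedule(num_steps)

    x = rng.standard_normal(z.shape)  # x_T ~ N(0, I)
    for t in range(num_steps - 1, -1, -1):
        eps_hat = toy_denoiser(x, alpha_bars[t], z)
        # Posterior mean of x_{t-1} given x_t and the predicted noise.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        # Fresh noise is injected at every step except the last one.
        noise = rng.standard_normal(z.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.uniform(-1.0, 1.0, size=(8, 8))  # toy "content" latent
    x_hat = ancestral_sample(z, num_steps=20, rng=rng)
    print(float(np.abs(x_hat - z).max()))
```

Because the toy denoiser has exact knowledge of the clean image through `z`, the chain collapses onto `z` at the final step. With a learned denoiser the reconstruction would only approximate the original, and the quality of that approximation is what the rate-perception tradeoff measures.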