Diffusion probabilistic models have recently achieved remarkable success in generating high-quality image and video data. In this work, we build on this class of generative models and introduce a method for lossy compression of high-resolution images. The resulting codec, which we call DIffusion-based Residual Augmentation Codec (DIRAC), is the first neural codec to allow smooth traversal of the rate-distortion-perception tradeoff at test time, while achieving perceptual quality competitive with GAN-based methods. Furthermore, while sampling from diffusion probabilistic models is notoriously expensive, we show that in the compression setting the number of sampling steps can be drastically reduced.