Recent work has shown a strong theoretical connection between variational autoencoders (VAEs) and rate-distortion theory. Motivated by this, we consider the problem of lossy image compression from the perspective of generative modeling. Starting from ResNet VAEs, which were originally designed for modeling the data (image) distribution, we redesign their latent variable model with a quantization-aware posterior and prior, enabling easy quantization and entropy coding for image compression. Together with improved neural network blocks, this yields a powerful and efficient class of lossy image coders that outperforms previous methods on natural (lossy) image compression. Our model compresses images in a coarse-to-fine fashion and supports parallel encoding and decoding, leading to fast execution on GPUs.
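To make the idea of a quantization-aware posterior and prior concrete, the following is a minimal PyTorch sketch of one latent block under common assumptions in learned compression (additive uniform noise during training, rounding of the residual at test time, and a discretized Gaussian prior for entropy coding); the class and argument names are hypothetical and this is not the authors' implementation.

```python
import torch
import torch.nn as nn


class QuantizedGaussianLatent(nn.Module):
    """Sketch of one quantization-aware latent block (hypothetical names).

    Training: the posterior is a unit-width uniform centred on the posterior
    mean, simulated with additive uniform noise so gradients can flow.
    Inference: the residual between the posterior mean and the prior mean is
    rounded to the nearest integer, which is what gets entropy-coded.
    """

    def forward(self, posterior_mean, prior_mean, prior_scale):
        if self.training:
            # Additive uniform noise in [-0.5, 0.5) approximates rounding.
            noise = torch.rand_like(posterior_mean) - 0.5
            z = posterior_mean + noise
        else:
            # Quantize the residual w.r.t. the prior mean, then shift back.
            z = torch.round(posterior_mean - prior_mean) + prior_mean

        # Probability mass of each symbol under the discretized Gaussian prior,
        # P(z) = CDF(z + 0.5) - CDF(z - 0.5); its -log2 is the coding cost in bits.
        prior = torch.distributions.Normal(prior_mean, prior_scale)
        pmf = prior.cdf(z + 0.5) - prior.cdf(z - 0.5)
        bits = -torch.log2(pmf.clamp_min(1e-9)).sum()
        return z, bits
```

In a hierarchical (coarse-to-fine) model, one such block would sit at every resolution level, with the prior mean and scale predicted from the latents already decoded at coarser levels.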