By optimizing the rate-distortion-realism trade-off, generative compression approaches produce detailed, realistic images, even at low bit rates, instead of the blurry reconstructions produced by rate-distortion optimized models. However, previous methods do not explicitly control how much detail is synthesized, which leads to a common criticism of these methods: users may worry that the generated reconstruction is misleading and far from the input image. In this work, we alleviate these concerns by training a decoder that can bridge the two regimes and navigate the distortion-realism trade-off. From a single compressed representation, the receiver can choose to produce a low mean squared error reconstruction that is close to the input, a realistic reconstruction with high perceptual quality, or anything in between. With our method, we set a new state of the art in distortion-realism, pushing the frontier of achievable distortion-realism pairs, i.e., our method achieves lower distortion at high realism and better realism at low distortion than previous methods.