Recent studies in lossy compression show that distortion and perceptual quality are at odds with each other, giving rise to the distortion-perception (D-P) tradeoff. Intuitively, attaining different levels of perceptual quality requires training different decoders. In this paper, we present the nontrivial finding that two decoders suffice to optimally achieve arbitrary D-P tradeoffs (an infinite number of them). We prove that any point on the D-P tradeoff bound can be achieved by simple linear interpolation between the outputs of a minimum-MSE decoder and a specifically constructed perfect perceptual decoder. Moreover, the perceptual quality (measured by the squared Wasserstein-2 distance) can be quantitatively controlled by the interpolation factor. Furthermore, to construct a perfect perceptual decoder, we propose two theoretically optimal training frameworks. Unlike the heuristic distortion-plus-adversarial-loss framework widely used in existing methods, the new frameworks are not only theoretically optimal but can also yield state-of-the-art performance in practical perceptual decoding. Finally, we validate our theoretical finding and demonstrate the superiority of our frameworks via experiments. Code is available at: https://github.com/ZeyuYan/Controllable-Perceptual-Compression
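To make the interpolation claim concrete, here is a minimal sketch in a toy scalar Gaussian setting. It is not taken from the paper or its repository. The assumptions are: a source X ~ N(0, 1) observed as Y = X + N with N ~ N(0, noise_var); the MMSE output x_mmse = a*Y with a = 1/(1 + noise_var); and a perfect perceptual output obtained by rescaling x_mmse to unit variance (the optimal-transport map between zero-mean Gaussians). The function names and the convention that alpha = 0 gives the MMSE decoder and alpha = 1 the perfect perceptual decoder are also illustrative choices.

```python
import math


def interpolate(x_mmse, x_perc, alpha):
    """Element-wise linear interpolation between two decoder outputs.

    alpha = 0 returns the MMSE output, alpha = 1 the perceptual output
    (this convention is an assumption made for the sketch).
    """
    return [(1.0 - alpha) * m + alpha * p for m, p in zip(x_mmse, x_perc)]


def gaussian_dp_point(noise_var, alpha):
    """Closed-form (distortion, perception) of the interpolated output
    in the toy scalar Gaussian setting described in the lead-in."""
    a = 1.0 / (1.0 + noise_var)        # MMSE gain; also Var(x_mmse)
    d_star = 1.0 - a                   # minimum MSE
    # x_perc = x_mmse / sqrt(a), so x_alpha = scale * x_mmse
    scale = (1.0 - alpha) + alpha / math.sqrt(a)
    std_alpha = scale * math.sqrt(a)   # standard deviation of x_alpha
    # Squared Wasserstein-2 distance between N(0, 1) and N(0, std_alpha^2)
    perception = (1.0 - std_alpha) ** 2
    # By orthogonality, E[(X - x_alpha)^2] = D* + E[(x_mmse - x_alpha)^2]
    distortion = d_star + (scale - 1.0) ** 2 * a
    return distortion, perception


if __name__ == "__main__":
    for alpha in (0.0, 0.5, 1.0):
        d, p = gaussian_dp_point(noise_var=3.0, alpha=alpha)
        print(f"alpha={alpha:.1f}  distortion={d:.4f}  W2^2={p:.4f}")
```

In this toy case the squared Wasserstein-2 distance equals (1 - alpha)^2 times its value at the MMSE decoder, so the interpolation factor sets the perceptual quality directly, and the distortion grows from D* by alpha^2 times the same constant. With noise_var = 3 and alpha = 0.5 this gives a distortion of 0.8125 and a squared Wasserstein-2 distance of 0.0625.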