Neural image compression methods have seen increasingly strong performance in recent years. However, they suffer from orders of magnitude higher computational complexity than traditional codecs, which stands in the way of real-world deployment. This paper takes a step toward closing this gap in decoding complexity by adopting shallow or even linear decoding transforms. To compensate for the resulting drop in compression performance, we exploit the often asymmetrical computation budget between encoding and decoding by adopting more powerful encoder networks and iterative encoding. We theoretically formalize the intuition behind this approach, and our experimental results establish a new frontier in the trade-off between rate-distortion performance and decoding complexity for neural image compression. Specifically, we achieve rate-distortion performance competitive with the established mean-scale hyperprior architecture of Minnen et al. (2018), while reducing overall decoding complexity by 80%, or by over 90% for the synthesis transform alone. Our code can be found at https://github.com/mandt-lab/shallow-ntc.
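To make the two ingredients concrete, here is a minimal sketch, assuming PyTorch; the class `LinearSynthesis`, the function `refine_latents`, the channel count, and the L2 rate proxy are illustrative assumptions, not the paper's exact architecture or objective, and quantization and entropy coding are omitted for brevity.

```python
import torch
import torch.nn as nn

class LinearSynthesis(nn.Module):
    """A single transposed convolution as the entire synthesis transform.

    With no hidden layers or nonlinearities, decoding reduces to one
    linear operation, in contrast to the multi-layer synthesis networks
    of typical hyperprior models. (Illustrative sketch, not the paper's
    exact decoder.)
    """
    def __init__(self, latent_channels: int = 192, block: int = 16):
        super().__init__()
        # stride == kernel size gives a block-transform-like structure
        self.deconv = nn.ConvTranspose2d(
            latent_channels, 3, kernel_size=block, stride=block)

    def forward(self, y_hat: torch.Tensor) -> torch.Tensor:
        return self.deconv(y_hat)

def refine_latents(y, decoder, x, lam=0.01, steps=100, lr=1e-2):
    """Iterative encoding: spend extra compute at encode time by
    optimizing the latents directly against a rate-distortion proxy.
    A simple L2 penalty stands in for a learned rate model here
    (an assumption for illustration)."""
    y = y.detach().clone().requires_grad_(True)
    opt = torch.optim.Adam([y], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        x_hat = decoder(y)
        distortion = torch.mean((x_hat - x) ** 2)
        rate_proxy = torch.mean(y ** 2)   # placeholder for -log p(y)
        loss = distortion + lam * rate_proxy
        loss.backward()
        opt.step()
    return y.detach()

decoder = LinearSynthesis()
x = torch.rand(1, 3, 256, 256)            # input image
y = torch.randn(1, 192, 16, 16)           # latents from some encoder
y_star = refine_latents(y, decoder, x)    # extra work on the encoder side
x_hat = decoder(y_star)                   # one cheap, linear decode
```

The asymmetry is visible here: `refine_latents` can run for arbitrarily many steps on the sender's side, while decoding remains a single linear map.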