Recently, deep learning-based image compression has made significant progress and has achieved better rate-distortion (R-D) performance than the latest traditional method, H.266/VVC, in both subjective metrics and the more challenging objective metrics. However, a major problem is that many leading learned schemes cannot maintain a good trade-off between performance and complexity. In this paper, we propose an efficient and effective image coding framework that achieves R-D performance similar to the state of the art at lower complexity. First, we develop an improved multi-scale residual block (MSRB) that expands the receptive field and makes it easier to capture global information. It can further capture and reduce the spatial correlation of the latent representations. Second, a more advanced importance map network is introduced to adaptively allocate bits to different regions of the image. Third, we apply a 2D post-quantization filter (PQF) to reduce the quantization error, motivated by the Sample Adaptive Offset (SAO) filter in video coding. Moreover, we find that the complexity of the encoder and that of the decoder have different effects on image compression performance. Based on this observation, we design an asymmetric paradigm in which the encoder employs three stages of MSRBs to improve the learning capacity, whereas the decoder needs only one stage of MSRB to yield satisfactory reconstruction, thereby reducing the decoding complexity without sacrificing performance. Experimental results show that, compared to the state-of-the-art method, the proposed method encodes and decodes about 17 times faster, while its R-D performance drops by less than 1% on both the Kodak and Tecnick datasets, which is still better than H.266/VVC (4:4:4) and other recent learning-based methods. Our source code is publicly available at https://github.com/fengyurenpingsheng.
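To make the motivation behind the post-quantization filter concrete, the following toy sketch shows an SAO-style band-offset correction on rounded scalar values. It is not the proposed PQF, which is a learned 2D filter; it only illustrates that quantization error has structure a post-filter can exploit. The Laplacian-like toy data and the helper names `band_offsets` and `mse` are assumptions made for this illustration.

```python
import random
from collections import defaultdict


def band_offsets(original, quantized):
    """Mean residual (original - quantized) for each distinct quantized value.

    Hypothetical helper: each quantized value acts as one "band", loosely
    following the band-offset idea of SAO in video coding.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, q in zip(original, quantized):
        sums[q] += x - q
        counts[q] += 1
    return {q: sums[q] / counts[q] for q in sums}


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


random.seed(0)

# Toy stand-in for latent values: zero-centred, Laplacian-like samples (assumption).
latents = [random.expovariate(1.5) * random.choice((-1, 1)) for _ in range(10000)]

# Scalar quantization by rounding to the nearest integer.
quantized = [round(x) for x in latents]

# Estimate one offset per band, then add it back after quantization.
offsets = band_offsets(latents, quantized)
corrected = [q + offsets[q] for q in quantized]

mse_before = mse(latents, quantized)
mse_after = mse(latents, corrected)
print(f"MSE before offset: {mse_before:.5f}")
print(f"MSE after offset:  {mse_after:.5f}")
```

Because each offset is the mean residual of its band, the correction cannot increase the squared error on the data it was estimated from. In a real codec the offsets would have to be signalled to the decoder or, as in a learned filter, predicted from the quantized values themselves.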