This paper considers the problem of lossy neural image compression (NIC). Current state-of-the-art (SOTA) methods adopt a uniform posterior to approximate quantization noise and a single-sample pathwise estimator to approximate the gradient of the evidence lower bound (ELBO). In this paper, we propose to train NIC with a multiple-sample importance weighted autoencoder (IWAE) target, which is tighter than the ELBO and converges to the log likelihood as the sample size increases. First, we identify that the uniform posterior of NIC has special properties that affect the variance and bias of the pathwise and score function estimators of the IWAE target. Moreover, we provide insights into a commonly adopted trick in NIC from the perspective of gradient variance. Based on these analyses, we further propose multiple-sample NIC (MS-NIC), an enhanced IWAE target for NIC. Experimental results demonstrate that it improves SOTA NIC methods. Our MS-NIC is plug-and-play and can easily be extended to other neural compression tasks.
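For reference, the single-sample ELBO and the K-sample IWAE bound referred to above can be written as follows; this is the standard formulation from the IWAE literature with generic notation, not the paper's own derivation:

\[
\mathcal{L}_{\mathrm{ELBO}}(x) = \mathbb{E}_{q(z \mid x)}\!\left[\log \frac{p(x, z)}{q(z \mid x)}\right],
\qquad
\mathcal{L}_{K}(x) = \mathbb{E}_{z_1,\dots,z_K \sim q(z \mid x)}\!\left[\log \frac{1}{K} \sum_{k=1}^{K} \frac{p(x, z_k)}{q(z_k \mid x)}\right],
\]

where \(\mathcal{L}_{1} = \mathcal{L}_{\mathrm{ELBO}} \le \mathcal{L}_{K} \le \mathcal{L}_{K+1} \le \log p(x)\) and \(\mathcal{L}_{K} \to \log p(x)\) as \(K \to \infty\), which is the sense in which the multiple-sample target is tighter than the ELBO.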