Relative entropy coding (REC) algorithms encode a sample from a target distribution $Q$ using a proposal distribution $P$, such that the expected codelength is $\mathcal{O}(D_{KL}[Q \,||\, P])$. REC can be seamlessly integrated with existing learned compression models since, unlike entropy coding, it does not assume discrete $Q$ or $P$, and does not require quantisation. However, general REC algorithms require an intractable $\Omega(e^{D_{KL}[Q \,||\, P]})$ runtime. We introduce AS* and AD* coding, two REC algorithms based on A* sampling. We prove that, for continuous distributions over $\mathbb{R}$, if the density ratio is unimodal, AS* has $\mathcal{O}(D_{\infty}[Q \,||\, P])$ expected runtime, where $D_{\infty}[Q \,||\, P]$ is the R\'enyi $\infty$-divergence. We provide experimental evidence that AD* also has $\mathcal{O}(D_{\infty}[Q \,||\, P])$ expected runtime. We prove that AS* and AD* achieve an expected codelength of $\mathcal{O}(D_{KL}[Q \,||\, P])$. Further, we introduce DAD*, an approximate algorithm based on AD* which retains its favourable runtime and has bias similar to that of alternative methods. Focusing on VAEs, we propose the IsoKL VAE (IKVAE), which can be used with DAD* to further improve compression efficiency. We evaluate A* coding with (IK)VAEs on MNIST, showing that it can losslessly compress images near the theoretically optimal limit.
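To make the two divergences in the bounds above concrete, the following minimal sketch compares $D_{KL}[Q \,||\, P]$ with $D_{\infty}[Q \,||\, P] = \log \sup_x q(x)/p(x)$ in a toy setting. The setting is an assumption chosen for illustration, not taken from the paper's experiments: $Q = \mathcal{N}(\mu, \sigma^2)$ with $0 < \sigma < 1$ and $P = \mathcal{N}(0, 1)$, for which the density ratio is unimodal and both quantities have closed forms. All function names are hypothetical.

```python
import math


def kl_gauss(mu, sigma):
    """D_KL[Q || P] in nats for Q = N(mu, sigma^2) and P = N(0, 1)."""
    return -math.log(sigma) + 0.5 * (sigma**2 + mu**2 - 1.0)


def log_ratio(x, mu, sigma):
    """log q(x)/p(x) for the same Q and P (illustrative assumption)."""
    return -math.log(sigma) - (x - mu) ** 2 / (2.0 * sigma**2) + 0.5 * x**2


def renyi_inf_gauss(mu, sigma):
    """D_inf[Q || P] = log sup_x q(x)/p(x) in nats, assuming 0 < sigma < 1.

    For sigma < 1 the log-ratio is a concave quadratic in x, so the
    density ratio is unimodal with its mode at x* = mu / (1 - sigma^2).
    """
    if not 0.0 < sigma < 1.0:
        raise ValueError("this closed form assumes 0 < sigma < 1")
    return -math.log(sigma) + mu**2 / (2.0 * (1.0 - sigma**2))


def renyi_inf_grid(mu, sigma, lo=-10.0, hi=10.0, n=20001):
    """Numerical check: maximise the log-ratio over a uniform grid."""
    step = (hi - lo) / (n - 1)
    return max(log_ratio(lo + i * step, mu, sigma) for i in range(n))


if __name__ == "__main__":
    mu, sigma = 2.0, 0.5
    kl = kl_gauss(mu, sigma)
    d_inf = renyi_inf_gauss(mu, sigma)
    print(f"D_KL  = {kl:.4f} nats")
    print(f"D_inf = {d_inf:.4f} nats (grid: {renyi_inf_grid(mu, sigma):.4f})")
    # exp(D_KL) is the scale of the lower bound on general REC runtime,
    # while D_inf is the scale of the AS* bound stated above.
    print(f"exp(D_KL) = {math.exp(kl):.2f}")
```

Since $D_{KL}[Q \,||\, P] \le D_{\infty}[Q \,||\, P]$ always holds, a runtime linear in $D_{\infty}$ is not automatically better than one exponential in $D_{KL}$; in this toy family, however, $D_{\infty}$ grows only quadratically in $\mu$ while $e^{D_{KL}}$ grows exponentially in $\mu^2$.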