Storing rapidly growing volumes of data is nontrivial and demands high-performance lossless compression techniques. Likelihood-based generative models have proven successful for lossless compression; among them, flow-based models are attractive because their bijective mappings allow exact optimisation of the data likelihood. However, common continuous flows conflict with the discreteness of coding schemes, which requires either 1) imposing strict constraints on the flow model, which degrades performance, or 2) coding numerous bijective mapping errors, which reduces efficiency. In this paper, we investigate volume preserving flows for lossless compression and show that a bijective mapping without error is possible. We propose the Numerical Invertible Volume Preserving Flow (iVPF), which is derived from general volume preserving flows. By introducing novel computation algorithms for flow models, we achieve an exact bijective mapping without any numerical error. We also propose a lossless compression algorithm based on iVPF. Experiments on various datasets show that the iVPF-based algorithm achieves a state-of-the-art compression ratio among lightweight compression algorithms.
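To make the notion of an error-free bijective mapping concrete, the following is a minimal sketch of an additive coupling layer, one of the simplest volume preserving flows, applied to fixed-point values. It is an illustration under stated assumptions and not the iVPF algorithm itself, whose details the abstract does not give. The precision `K`, the helper `quantize`, and the function `shift` (a stand-in for a learned network) are all hypothetical names introduced here.

```python
import math

K = 8            # assumed fixed-point precision: values are multiples of 2**-K
SCALE = 1 << K


def quantize(v):
    """Round a float to the nearest integer count of 2**-K steps."""
    return round(v * SCALE)


def shift(xa):
    """Hypothetical stand-in for a learned shift network t(x_a)."""
    return [math.tanh(0.5 * a / SCALE) for a in xa]


def forward(xa, xb):
    """Additive coupling: y_a = x_a, y_b = x_b + quantize(t(x_a)).

    The Jacobian is unit triangular, so the map is volume preserving.
    """
    yb = [b + quantize(t) for b, t in zip(xb, shift(xa))]
    return xa, yb


def inverse(ya, yb):
    """Recompute the same quantized shift from y_a and subtract it."""
    xb = [b - quantize(t) for b, t in zip(yb, shift(ya))]
    return ya, xb


# Inputs are stored as integers (fixed-point), so the round trip is exact.
xa = [quantize(0.3), quantize(-1.25)]
xb = [quantize(2.0), quantize(0.7)]
ya, yb = forward(xa, xb)
assert inverse(ya, yb) == (xa, xb)
```

Because the shift is quantized before it is added, both directions use only integer addition and subtraction of the same value, so no mapping error remains to be coded. This sketch covers only the additive case; it makes no claim about how iVPF handles more general volume preserving layers.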