Many data compressors regularly encode probability distributions for entropy coding, which calls for minimum-description-length type optimizations. Canonical prefix (Huffman) coding usually just stores the lengths of the bit sequences, thereby approximating probabilities with powers of 2. Operating on more accurate probabilities usually allows for better compression ratios, and is possible using e.g. arithmetic coding or the Asymmetric Numeral Systems (ANS) family. In particular, the multiplication-free tabled variant of the latter (tANS) builds an automaton that often replaces Huffman coding thanks to better compression at similar computational cost, e.g. in the popular Facebook Zstandard and Apple LZFSE compressors. The encoding of probability distributions for such applications is discussed, especially using a Pyramid Vector Quantizer (PVQ)-based approach with deformation, as well as a tuned symbol spread for tANS.
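To make the power-of-2 penalty concrete, the following minimal sketch compares the expected bits per symbol when a distribution is approximated by Huffman code lengths with the cost when it is approximated by integer counts summing to a table size L = 2^R, as a tANS-style coder would use. The example distribution, the table size of 32, and the helper names `huffman_lengths`, `quantize_to_counts` and `cross_entropy` are illustrative assumptions. The quantizer is a naive rounding-and-repair scheme, not the PVQ-based approach discussed here.

```python
import heapq
import math


def huffman_lengths(probs):
    """Return the Huffman code length (in bits) of each symbol."""
    # Heap entries: (subtree probability, unique tie-breaker, symbols in subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tie = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1  # each merge adds one bit to every symbol below it
        heapq.heappush(heap, (p1 + p2, tie, s1 + s2))
        tie += 1
    return lengths


def quantize_to_counts(probs, total):
    """Naive quantizer (an assumption, not PVQ): positive counts summing to total."""
    if total < len(probs):
        raise ValueError("total must be at least the number of symbols")
    counts = [max(1, round(p * total)) for p in probs]
    while sum(counts) != total:
        if sum(counts) > total:
            # Lower the count that most overshoots its target, keeping counts >= 1.
            i = max((j for j, c in enumerate(counts) if c > 1),
                    key=lambda j: counts[j] - probs[j] * total)
            counts[i] -= 1
        else:
            # Raise the count that most undershoots its target.
            i = max(range(len(counts)),
                    key=lambda j: probs[j] * total - counts[j])
            counts[i] += 1
    return counts


def cross_entropy(probs, approx):
    """Expected bits/symbol when data from probs is coded assuming approx."""
    return sum(-p * math.log2(q) for p, q in zip(probs, approx) if p > 0)


probs = [0.7, 0.15, 0.1, 0.05]   # hypothetical source distribution
table_size = 32                  # hypothetical tANS table size L = 2^5

lengths = huffman_lengths(probs)
counts = quantize_to_counts(probs, table_size)

entropy = cross_entropy(probs, probs)
huffman_cost = cross_entropy(probs, [2.0 ** -n for n in lengths])
counts_cost = cross_entropy(probs, [c / table_size for c in counts])

print(f"entropy:         {entropy:.4f} bits/symbol")
print(f"Huffman lengths: {huffman_cost:.4f} bits/symbol, lengths={lengths}")
print(f"integer counts:  {counts_cost:.4f} bits/symbol, counts={counts}")
```

For this skewed example the count-based approximation stays much closer to the entropy than the power-of-2 approximation. The other side of the trade-off is the header: integer counts take more bits to store than code lengths, which is the description-length balance the abstract refers to.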