改进预处理前的运行长度编码 (Improving Run Length Encoding by Preprocessing) - 专知论文

会员服务 ·

0

Better · SimPLe · 分解的 · 粤港澳大湾区数字经济研究院 · 变换 ·

2021 年 1 月 13 日

Improving Run Length Encoding by Preprocessing

翻译：改进预处理前的运行长度编码

Sven Fiergolla,Petra Wolf

The Run Length Encoding (RLE) compression method is a long standing simple lossless compression scheme which is easy to implement and achieves a good compression on input data which contains repeating consecutive symbols. In its pure form RLE is not applicable on natural text or other input data with short sequences of identical symbols. We present a combination of preprocessing steps that turn arbitrary input data in a byte-wise encoding into a bit-string which is highly suitable for RLE compression. The main idea is to first read all most significant bits of the input byte-string, followed by the second most significant bit, and so on. We combine this approach by a dynamic byte remapping as well as a Burrows-Wheeler-Scott transform on a byte level. Finally, we apply a Huffman Encoding on the output of the bit-wise RLE encoding to allow for more dynamic lengths of code words encoding runs of the RLE. With our technique we can achieve a lossless average compression which is better than the standard RLE compression by a factor of 8 on average.

翻译：运行长度编码( RLE) 压缩方法是一个长期的简单且不丢失的压缩方法, 它很容易执行, 并且能够对含有重复连续符号的输入数据进行良好的压缩。纯格式的 RLE 不适用于自然文本或其他输入数据, 带有相同符号的短顺序。我们展示了一个组合的预处理步骤, 将任意输入数据在字节编码中转换成一个非常适合 RLE 压缩的位字符串。主要的想法是首先读取输入字节字符串中最重要的部分, 然后再读第二个最重要的部分等等。我们通过动态的字节重新绘图以及字节级的 Burrows- Wheeler- Scott 转换, 将这个方法结合起来。最后, 我们用一个 Huffman Ecoding 来对位式的 RLEE 编码输出进行输入, 以便让 RLE 的代码编码运行有更动态的长度。我们的技术可以实现一个不损失的平均压缩, 比标准 RLE 压缩的系数平均8 更好。

0

相关内容

Better

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

325+阅读 · 2020年11月26日

Python图像处理，366页pdf，Image Operators Image Processing in Python

Python图像处理，366页pdf，Image Operators Image Processing in Python

专知会员服务

78+阅读 · 2020年7月23日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

专知会员服务

77+阅读 · 2020年2月8日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【下载】Python自然语言处理实战书籍和代码《Natural Language Processing in Action》

【下载】Python自然语言处理实战书籍和代码《Natural Language Processing in Action》

专知会员服务

80+阅读 · 2019年10月27日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

163+阅读 · 2019年10月12日

开源书：PyTorch深度学习起步

开源书：PyTorch深度学习起步

专知会员服务

51+阅读 · 2019年10月11日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【Github】nlp-tutorial：TensorFlow 和 PyTorch 实现各种NLP模型

【Github】nlp-tutorial：TensorFlow 和 PyTorch 实现各种NLP模型

AINLP

14+阅读 · 2019年9月4日

已删除

将门创投

8+阅读 · 2019年6月13日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

【跟踪Tracking】15篇论文+代码 | 中秋快乐~

【跟踪Tracking】15篇论文+代码 | 中秋快乐~

专知

18+阅读 · 2018年9月24日

【推荐】自然语言处理（NLP）指南

【推荐】自然语言处理（NLP）指南

机器学习研究会

35+阅读 · 2017年11月17日

【推荐】用Python/OpenCV实现增强现实

【推荐】用Python/OpenCV实现增强现实

机器学习研究会

15+阅读 · 2017年11月16日

Code Prediction by Feeding Trees to Transformers

Arxiv

0+阅读 · 2021年3月8日

On the Request-Trip-Vehicle Assignment Problem

Arxiv

0+阅读 · 2021年3月8日

Pretrained Transformers Improve Out-of-Distribution Robustness

Arxiv

5+阅读 · 2020年4月13日

Question Generation by Transformers

Question Generation by Transformers

Arxiv

5+阅读 · 2019年9月14日

Language Modeling with Deep Transformers

Arxiv

6+阅读 · 2019年7月11日

On Attribution of Recurrent Neural Network Predictions via Additive Decomposition

Arxiv

3+阅读 · 2019年3月27日

Improving Question Answering by Commonsense-Based Pre-Training

Arxiv

4+阅读 · 2019年3月1日

Improving the Transformer Translation Model with Document-Level Context

Arxiv

4+阅读 · 2018年10月8日

Improving Deep Binary Embedding Networks by Order-aware Reweighting of Triplets

Arxiv

7+阅读 · 2018年4月17日

Improving Multiple Object Tracking with Optical Flow and Edge Preprocessing

Arxiv

10+阅读 · 2018年1月29日

VIP会员

文章信息

相关主题

粤港澳大湾区数字经济研究院

相关VIP内容

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

325+阅读 · 2020年11月26日

Python图像处理，366页pdf，Image Operators Image Processing in Python

Python图像处理，366页pdf，Image Operators Image Processing in Python

专知会员服务

78+阅读 · 2020年7月23日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

专知会员服务

77+阅读 · 2020年2月8日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【下载】Python自然语言处理实战书籍和代码《Natural Language Processing in Action》

【下载】Python自然语言处理实战书籍和代码《Natural Language Processing in Action》

专知会员服务

80+阅读 · 2019年10月27日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

163+阅读 · 2019年10月12日

开源书：PyTorch深度学习起步

开源书：PyTorch深度学习起步

专知会员服务

51+阅读 · 2019年10月11日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

热门VIP内容

开通专知VIP会员享更多权益服务

美国启动“自有军事人工智能计划”：采用谷歌Gemini以推动全军人工智能应用

《创新与适应性作为军事成功的关键因素：来自俄乌战争的战略洞见》报告

《人工智能与人类的未来》2025年最新300页书籍

《代码、指挥与冲突：描绘军事人工智能的未来》报告

相关资讯

【Github】nlp-tutorial：TensorFlow 和 PyTorch 实现各种NLP模型

【Github】nlp-tutorial：TensorFlow 和 PyTorch 实现各种NLP模型

AINLP

14+阅读 · 2019年9月4日

已删除

将门创投

8+阅读 · 2019年6月13日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

【跟踪Tracking】15篇论文+代码 | 中秋快乐~

【跟踪Tracking】15篇论文+代码 | 中秋快乐~

专知

18+阅读 · 2018年9月24日

【推荐】自然语言处理（NLP）指南

【推荐】自然语言处理（NLP）指南

机器学习研究会

35+阅读 · 2017年11月17日

【推荐】用Python/OpenCV实现增强现实

【推荐】用Python/OpenCV实现增强现实

机器学习研究会

15+阅读 · 2017年11月16日

相关论文

Code Prediction by Feeding Trees to Transformers

Arxiv

0+阅读 · 2021年3月8日

On the Request-Trip-Vehicle Assignment Problem

Arxiv

0+阅读 · 2021年3月8日

Pretrained Transformers Improve Out-of-Distribution Robustness

Arxiv

5+阅读 · 2020年4月13日

Question Generation by Transformers

Question Generation by Transformers

Arxiv

5+阅读 · 2019年9月14日

Language Modeling with Deep Transformers

Arxiv

6+阅读 · 2019年7月11日

On Attribution of Recurrent Neural Network Predictions via Additive Decomposition

Arxiv

3+阅读 · 2019年3月27日

Improving Question Answering by Commonsense-Based Pre-Training

Arxiv

4+阅读 · 2019年3月1日

Improving the Transformer Translation Model with Document-Level Context

Arxiv

4+阅读 · 2018年10月8日

Improving Deep Binary Embedding Networks by Order-aware Reweighting of Triplets

Arxiv

7+阅读 · 2018年4月17日

Improving Multiple Object Tracking with Optical Flow and Edge Preprocessing

Arxiv

10+阅读 · 2018年1月29日

微信扫码咨询专知VIP会员