Sequential data is being generated at an unprecedented pace in various forms, including text and genomic data. This creates the need for efficient compression mechanisms to enable better storage, transmission and processing of such data. To address this problem, many existing compressors attempt to learn models of the data and perform prediction-based compression. Since neural networks are known to be universal function approximators with the capability to learn arbitrarily complex mappings, and in practice show excellent performance in prediction tasks, we explore and devise methods to compress sequential data using neural network predictors. We combine recurrent neural network predictors with an arithmetic coder and losslessly compress a variety of synthetic, text and genomic datasets. The proposed compressor outperforms Gzip on the real datasets and achieves near-optimal compression for the synthetic datasets. The results also help in understanding why and where neural networks are good alternatives to traditional finite context models.
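The combination described here, a predictor that outputs a probability distribution over the next symbol and an arithmetic coder that narrows an interval according to that distribution, can be illustrated with a small sketch. The sketch below is not the paper's implementation: it uses a placeholder adaptive order-0 frequency model (the names `AdaptiveModel` and `encode` are hypothetical) where a recurrent neural network predictor would sit, and a toy floating-point interval coder rather than the integer, renormalizing coder required for long inputs in practice.

```python
# Minimal sketch of prediction-based arithmetic coding (illustrative only).
# A trained RNN would replace AdaptiveModel by returning P(next symbol | history).

class AdaptiveModel:
    """Order-0 adaptive model standing in for an RNN predictor."""
    def __init__(self, alphabet):
        self.alphabet = list(alphabet)
        self.counts = {s: 1 for s in self.alphabet}  # Laplace-smoothed counts

    def probabilities(self, history):
        total = sum(self.counts.values())
        return {s: c / total for s, c in self.counts.items()}

    def update(self, symbol):
        self.counts[symbol] += 1


def encode(sequence, model):
    """Narrow the interval [low, high) by the predicted probability of each symbol."""
    low, high = 0.0, 1.0
    for sym in sequence:
        probs = model.probabilities(None)  # an RNN would condition on the history here
        span = high - low
        cum = 0.0
        for s in model.alphabet:
            if s == sym:
                high = low + span * (cum + probs[s])
                low = low + span * cum
                break
            cum += probs[s]
        model.update(sym)
    # Any number inside the final interval identifies the sequence;
    # better predictions leave a wider interval, hence fewer bits.
    return (low + high) / 2


if __name__ == "__main__":
    data = "abracadabra"
    code = encode(data, AdaptiveModel("abcdr"))
    print(f"encoded '{data}' to {code:.10f}")
```

The decoder (omitted for brevity) mirrors this process: given the same predictor, it locates the encoded number within the successive sub-intervals to recover the symbols, which is why compression is lossless as long as encoder and decoder share the model.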