
November 4, 2021

On the Optimal Time/Space Tradeoff for Hash Tables


Michael A. Bender, Martín Farach-Colton, John Kuszmaul, William Kuszmaul, Mingmou Liu

From arXiv, 48 pages

For nearly six decades, the central open question in the study of hash tables has been to determine the optimal achievable tradeoff curve between time and space. State-of-the-art hash tables offer the following guarantee: if keys/values are Θ(log n) bits each, then it is possible to achieve constant-time insertions/deletions/queries while wasting only O(log log n) bits of space per key when compared to the information-theoretic optimum. Even prior to this bound being achieved, the target of O(log log n) wasted bits per key was known to be a natural end goal, and was proven to be optimal for a number of closely related problems (e.g., stable hashing, dynamic retrieval, and dynamically-resized filters). This paper shows that O(log log n) wasted bits per key is not the end of the line for hashing. In fact, for any k ∈ [log* n], it is possible to achieve O(k)-time insertions/deletions, O(1)-time queries, and O(log^(k) n) wasted bits per key (all with high probability in n), where log^(k) denotes the logarithm iterated k times. This means that each time we increase insertion/deletion time by an additive constant, we reduce the wasted bits per key exponentially. We further show that this tradeoff curve is the best achievable by any of a large class of hash tables, including any hash table designed using the current framework for making constant-time hash tables succinct.
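To get a feel for how quickly the wasted-bits bound shrinks as k grows, the following minimal sketch evaluates the k-times iterated logarithm and log* numerically. It is only an illustration of the asymptotic quantities named in the abstract, not code from the paper: the function names (`iterated_log`, `log_star`, `tradeoff_table`) are hypothetical, base-2 logarithms are an assumption, and the constants hidden by the O(·) notation are ignored.

```python
import math


def iterated_log(n: float, k: int) -> float:
    """Apply log2 to n exactly k times (assumed base 2)."""
    value = float(n)
    for _ in range(k):
        if value <= 0:
            raise ValueError("iterated logarithm undefined for this n and k")
        value = math.log2(value)
    return value


def log_star(n: float) -> int:
    """Count how many times log2 must be applied until the value is <= 1."""
    count = 0
    value = float(n)
    while value > 1:
        value = math.log2(value)
        count += 1
    return count


def tradeoff_table(n: float) -> list[tuple[int, float]]:
    """Pair each k in 1..log*(n) with the k-times iterated logarithm of n.

    k stands in for the O(k) insertion/deletion time and the second entry
    for the O(log^(k) n) wasted bits per key, constants ignored.
    """
    return [(k, iterated_log(n, k)) for k in range(1, log_star(n) + 1)]


if __name__ == "__main__":
    for k, wasted in tradeoff_table(2 ** 64):
        print(f"k = {k}: iterated log = {wasted:.3f}")
```

Each additional unit of k takes one more logarithm of the previous value, which is the exponential reduction in wasted bits per additive increase in update time that the abstract describes.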

