A Faster Algorithm for Finding Closest Pairs in Hamming Metric - Zhuanzhi Papers


February 4, 2021

A Faster Algorithm for Finding Closest Pairs in Hamming Metric


Andre Esser, Robert Kübler, Floyd Zweydinger

From arXiv: 22, 6 figures. Code: https://github.com/submission-nn/nn-algorithm

We study the Closest Pair Problem in the Hamming metric, which asks to find the pair with the smallest Hamming distance in a collection of binary vectors. We give a new randomized algorithm for the problem on uniformly random input that outperforms previous approaches whenever the dimension of the input points is small compared to the dataset size. For moderate to large dimensions, our algorithm matches the time complexity of the previously best-known algorithms based on locality-sensitive hashing. Technically, our algorithm follows similar design principles as Dubiner (IEEE Trans. Inf. Theory 2010) and May-Ozerov (Eurocrypt 2015). Besides improving the time complexity in the aforementioned areas, we significantly simplify the analysis of these previous works. We give a modular analysis, which also allows us to investigate the performance of the algorithm on non-uniform input distributions. Furthermore, we give a proof-of-concept implementation of our algorithm, which performs well in comparison to a quadratic search baseline. This is the first step towards answering an open question raised by May and Ozerov regarding the practicability of algorithms following these design principles.
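To make the problem statement concrete, here is a minimal Python sketch of the quadratic search baseline that the abstract mentions as the point of comparison. It is not the authors' algorithm and is not taken from their repository. The function names, the encoding of binary vectors as Python integers, and the `random_instance` helper that plants one close pair among uniformly random vectors are all assumptions made for illustration.

```python
import random
from itertools import combinations


def hamming_distance(x: int, y: int) -> int:
    """Number of bit positions in which x and y differ."""
    return bin(x ^ y).count("1")


def closest_pair_quadratic(vectors):
    """Check all pairs and return (distance, i, j) for the closest one.

    Runs in time quadratic in len(vectors); ties keep the first pair found.
    Returns None if there are fewer than two vectors.
    """
    best = None
    for (i, x), (j, y) in combinations(enumerate(vectors), 2):
        d = hamming_distance(x, y)
        if best is None or d < best[0]:
            best = (d, i, j)
    return best


def random_instance(n, dim, planted_distance, seed=0):
    """Hypothetical test-instance generator (an assumption, not from the paper).

    Draws n uniformly random dim-bit vectors, then overwrites the last one
    with a copy of the first that has planted_distance bits flipped.
    """
    rng = random.Random(seed)
    vectors = [rng.getrandbits(dim) for _ in range(n)]
    close = vectors[0]
    for bit in rng.sample(range(dim), planted_distance):
        close ^= 1 << bit
    vectors[-1] = close
    return vectors
```

Calling `closest_pair_quadratic(random_instance(50, 64, 2))` compares all 1225 pairs. This exhaustive comparison is the cost that the faster algorithms discussed in the abstract aim to avoid when the dataset is large.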

