Compressed Indexing for Consecutive Occurrences - Zhuanzhi Papers

Compressed indexing · Index structure · Space complexity · Structure · Gap

April 3, 2023

Compressed Indexing for Consecutive Occurrences

Paweł Gawrychowski, Garance Gourdel, Tatiana Starikovskaya, Teresa Anna Steiner

From arXiv. This is a full version of a paper accepted to CPM 2023.

The fundamental question considered in algorithms on strings is that of indexing, that is, preprocessing a given string for specific queries. By now we have a number of efficient solutions for this problem when the queries ask for an exact occurrence of a given pattern $P$. However, practical applications motivate the necessity of considering more complex queries, for example concerning near occurrences of two patterns. Recently, Bille et al. [CPM 2021] introduced a variant of such queries, called gapped consecutive occurrences, in which a query consists of two patterns $P_{1}$ and $P_{2}$ and a range $[a,b]$, and one must find all consecutive occurrences $(q_1,q_2)$ of $P_{1}$ and $P_{2}$ such that $q_2-q_1 \in [a,b]$. By their results, we cannot hope for a very efficient indexing structure for such queries, even if $a=0$ is fixed (although at the same time they provided a non-trivial upper bound). Motivated by this, we focus on a text given as a straight-line program (SLP) and design an index taking space polynomial in the size of the grammar that answers such queries in time optimal up to polylog factors.
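To make the query semantics concrete, here is a minimal Python sketch of a naive baseline that answers a gapped consecutive occurrences query by scanning the uncompressed text. It is not the SLP-based index described in the abstract, and it makes no attempt to match its space or time bounds. The function names `occurrences` and `gapped_consecutive_occurrences` are hypothetical. The sketch also assumes one particular reading of "consecutive": $q_1 < q_2$, with no occurrence of either pattern starting strictly between $q_1$ and $q_2$; the paper's formal definition may differ in boundary cases.

```python
from bisect import bisect_right


def occurrences(text: str, pattern: str) -> list[int]:
    """Return all (possibly overlapping) start positions of pattern in text."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


def gapped_consecutive_occurrences(
    text: str, p1: str, p2: str, a: int, b: int
) -> list[tuple[int, int]]:
    """Naive baseline: report pairs (q1, q2) with q2 - q1 in [a, b].

    Assumed definition (not taken from the paper): q1 is an occurrence of
    p1, q2 is an occurrence of p2, q1 < q2, and no occurrence of p1 or p2
    starts strictly between q1 and q2. Patterns are assumed non-empty.
    """
    occ1 = occurrences(text, p1)
    occ2 = occurrences(text, p2)
    result = []
    for q1 in occ1:
        # First occurrence of p2 strictly after q1, if any.
        j = bisect_right(occ2, q1)
        if j == len(occ2):
            continue
        q2 = occ2[j]
        # Skip q1 if another occurrence of p1 starts between q1 and q2.
        i = bisect_right(occ1, q1)
        if i < len(occ1) and occ1[i] < q2:
            continue
        if a <= q2 - q1 <= b:
            result.append((q1, q2))
    return result


if __name__ == "__main__":
    # Occurrences of "a" are at 0, 3, 7 and occurrences of "b" at 2, 6, 8.
    print(gapped_consecutive_occurrences("aXbaYYbab", "a", "b", 2, 3))
```

This baseline takes time at least linear in the text length for every query, which is the cost an index is meant to avoid; it is useful mainly as a reference against which a compressed index could be checked on small inputs.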
