Computing the Difference of Conjunctive Queries Efficiently - 专知 (Zhuanzhi) Papers

Structural properties · Heuristic methods · Heuristics · Database systems · Algorithms

April 20, 2023

Computing the Difference of Conjunctive Queries Efficiently


Xiao Hu,Qichen Wang

We investigate how to efficiently compute the difference result of two (or multiple) conjunctive queries, which is the last operator in relational algebra to be unraveled. The standard approach in practical database systems is to materialize the results of every input query as a separate set, and then compute the difference of the two (or multiple) sets. This approach is bottlenecked by the complexity of evaluating every input query individually, which can be very expensive, particularly when there are only a few results in the difference. In this paper, we introduce a new approach that exploits the structural properties of the input queries and rewrites the original query by pushing the difference operator down as far as possible. We show that for a large class of difference queries, this approach leads to a linear-time algorithm in terms of the input size and the (final) output size, i.e., the number of query results that survive the difference operator. We complete this result by showing the hardness of computing the remaining difference queries in linear time. Although a linear-time algorithm is hard to achieve in general, we also provide some heuristics that provably improve on the standard approach. Finally, we compare our approach with standard SQL engines over graph and benchmark datasets. The experimental results demonstrate order-of-magnitude speedups achieved by our approach over vanilla SQL.
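To make the contrast in the abstract concrete, here is a minimal Python sketch for one toy case. It is not taken from the paper. It assumes set semantics and two queries with the same shape, Q1(a, b, c) = R1(a, b) ⋈ S1(b, c) and Q2(a, b, c) = R2(a, b) ⋈ S2(b, c). The relation names and the functions `join`, `difference_baseline`, and `difference_pushed_down` are hypothetical. For this shape, a result of Q1 is missing from Q2 exactly when its (a, b) part is not in R2 or its (b, c) part is not in S2, so Q1 − Q2 = ((R1 − R2) ⋈ S1) ∪ (R1 ⋈ (S1 − S2)).

```python
from collections import defaultdict


def join(r, s):
    """Natural join of r(a, b) and s(b, c) on b, as a set of (a, b, c) tuples."""
    index = defaultdict(list)  # hash index on the join attribute b
    for b, c in s:
        index[b].append(c)
    return {(a, b, c) for a, b in r for c in index.get(b, ())}


def difference_baseline(r1, s1, r2, s2):
    """Standard approach: materialize both query results, then subtract."""
    return join(r1, s1) - join(r2, s2)


def difference_pushed_down(r1, s1, r2, s2):
    """Toy rewrite: apply the difference to base relations before joining.

    Every tuple produced by either join below belongs to the final answer,
    so no intermediate result is larger than the output.
    """
    r_only = r1 - r2  # (a, b) pairs that Q2 cannot produce
    s_only = s1 - s2  # (b, c) pairs that Q2 cannot produce
    return join(r_only, s1) | join(r1, s_only)


if __name__ == "__main__":
    r1 = {(1, 2), (3, 2)}
    s1 = {(2, 5), (2, 6)}
    r2 = {(1, 2)}
    s2 = {(2, 5)}
    print(sorted(difference_pushed_down(r1, s1, r2, s2)))
```

The baseline always pays for both full joins, even when almost nothing survives the difference. In the rewritten form, the cost of each join is bounded by the input size plus the number of surviving results, which is the kind of bound the abstract describes. General query shapes need the structural analysis the paper develops, and this sketch does not attempt it.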

