The proliferation of RDF datasets has resulted in studies focusing on optimizing SPARQL query processing. Most existing work focuses on basic graph patterns (BGPs) and ignores other vital operators in SPARQL, such as UNION and OPTIONAL. SPARQL queries with these operators, which we abbreviate as SPARQL-UO, pose serious query plan generation challenges. In this paper, we propose techniques for executing SPARQL-UO queries using BGP execution as a building block, based on a novel BGP-based Evaluation (BE)-Tree representation of query plans. On top of this, we propose a series of cost-driven BE-tree transformations to generate more efficient plans by reducing the search space and intermediate result sizes, and a candidate pruning technique that further enhances efficiency at query time. Experiments confirm that our method outperforms the state-of-the-art by orders of magnitude.
翻译:RDF数据集的增加导致了重点优化SPARQL查询处理的研究。现有的大部分工作都集中在基本图模式(BGP)上,没有考虑其他在SPARQL中非常关键的操作符,例如UNION和OPTIONAL。这些操作符的SPARQL查询,我们简称为SPARQL-UO,给查询计划生成带来了严重挑战。在本文中,我们提出了使用BGP执行作为构建块的执行SPARQL-UO查询的技术,该技术基于查询计划的一种新颖的基于BGP的评估(BE)树表示。在此基础上,我们提出一系列以成本为驱动的BE-tree转换,通过减少搜索空间和中间结果大小来生成更有效的计划,并提出一种更进一步提高查询效率的候选人修剪技术。实验证实,我们的方法在效率方面远远优于现有技术。