Many techniques have been developed for the cardinality estimation problem in data management systems. In this document, we introduce a framework for cardinality estimation of query patterns over property graph databases, which makes it possible to analyze, compare and combine different cardinality estimation approaches. This framework consists of three phases: obtaining a set of estimates for some subqueries, extending this set and finally combining the set into a single cardinality estimate for the query. We show that (parts of) many of the existing cardinality estimation approaches can be used as techniques in one of the phases from our framework. The three phases are loosely coupled, this makes it possible to combine (parts of) current cardinality estimation approaches. We create a graph version of the Join Order Benchmark to perform experiments with different combinations of techniques. The results show that query patterns without property constraints can be accurately estimated using synopses for small patterns. Accurate estimation of query patterns with property constraints require new estimation techniques to be developed that capture correlations between the property constraints and the topology in graph databases.
翻译:为数据管理系统中的最基本估计问题开发了许多技术。 在本文件中,我们引入了一个框架,用于对属性图数据库的查询模式进行最基本估计,从而可以分析、比较和合并不同的基本估计方法。这个框架包括三个阶段:为某些子类获得一套估计,扩大这一组,最后将这套方法合并为查询的单一基本估计方法。我们表明,许多现有的最基本估计方法(部分)可以作为我们框架的一个阶段的技术使用。这三个阶段是松散的,这样可以将当前最主要估计方法(部分)结合起来。我们创建了一个“联合命令”基准的图表版本,用不同的技术组合进行实验。结果显示,在没有财产限制的情况下,查询模式可以精确地使用小模式的合成来估计。对有财产限制的查询模式进行精确的估计,需要开发新的估计技术,以掌握在图形数据库中的财产限制和表层学之间的关系。