Bitvector filtering is an important query processing technique that can significantly reduce execution cost, especially for complex decision support queries with multiple joins. Despite its wide application, its implications for query optimization are not well understood. In this work, we study how bitvector filters impact query optimization. We show that straightforwardly incorporating bitvector filters into query optimization can increase the plan space complexity by a factor that is exponential in the number of relations in the query. We analyze plans with bitvector filters for star and snowflake queries in the plan space of right-deep trees without cross products. Surprisingly, under some simplifying assumptions, we prove that the minimum-cost plan with bitvector filters can be found among a set of candidate plans whose size is linear in the number of relations in the query. This reduces the plan space complexity for such queries from exponential to linear. Motivated by our analysis, we propose an algorithm that accounts for the impact of bitvector filters in query optimization. Our algorithm optimizes the join order for an arbitrary decision support query by choosing from a number of candidate plans that is linear in the number of relations in the query. We implement our algorithm in Microsoft SQL Server as a transformation rule. Our evaluation on both industry-standard benchmarks and customer workloads shows that, compared with the original Microsoft SQL Server, our technique reduces the total CPU execution time by 22%-64% for the workloads, with up to two orders of magnitude reduction in CPU execution time for individual queries.
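To make the filtering mechanism concrete, the following is a minimal sketch of a bitvector filter applied during a single hash join. It is an illustration under stated assumptions, not the implementation in Microsoft SQL Server: the function names, the single-hash bitvector layout, and the default size of 1024 bits are all hypothetical. The build side sets one bit per join key, and the probe side discards rows whose key maps to an unset bit before they reach the hash table. A filter of this kind can pass rows that do not join (false positives) but never drops rows that do.

```python
def build_bitvector(keys, num_bits):
    """Set one bit per build-side join key (single hash function, assumed)."""
    bits = bytearray((num_bits + 7) // 8)
    for key in keys:
        h = hash(key) % num_bits
        bits[h // 8] |= 1 << (h % 8)
    return bits


def may_contain(bits, num_bits, key):
    """Return False only if the key is definitely absent from the build side."""
    h = hash(key) % num_bits
    return bool(bits[h // 8] & (1 << (h % 8)))


def hash_join_with_bitvector(build_rows, probe_rows, num_bits=1024):
    """Join (key, payload) build rows with (key, value) probe rows.

    Returns the joined rows and the number of probe rows that passed the
    bitvector filter and therefore reached the hash table.
    """
    # Build phase: construct the hash table and the bitvector together.
    table = {}
    for key, payload in build_rows:
        table.setdefault(key, []).append(payload)
    bits = build_bitvector(table.keys(), num_bits)

    # Probe phase: test the bitvector first, then probe the hash table.
    joined = []
    probed = 0
    for key, value in probe_rows:
        if not may_contain(bits, num_bits, key):
            continue  # eliminated early, no hash table lookup
        probed += 1
        for payload in table.get(key, ()):
            joined.append((key, payload, value))
    return joined, probed


dim_rows = [(1, "a"), (2, "b")]
fact_rows = [(1, 10), (3, 20), (2, 30), (1025, 40)]
joined, probed = hash_join_with_bitvector(dim_rows, fact_rows)
```

In this toy input, the probe row with key 3 is eliminated by the filter, while the row with key 1025 collides with key 1 in the 1024-bit vector, passes the filter, and is rejected only by the hash table lookup. In a multi-join plan the same filter could be applied lower in the plan, on the scan of the probe-side relation, which is why the filters interact with the choice of join order.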