Selectivity estimation refers to the ability of the SQL query optimizer to estimate the size of the results of a predicate in the query. It is the main calculation, based on which the optimizer can select the cheapest plan to execute. While the problem is known since the mid 70s, we were surprised that there are no solutions in the literature for the selectivity estimation of inequality joins. By testing four common database systems: Oracle, SQL-Server, PostgreSQL, and MySQL, we found that the open-source systems PostgreSQL and MySQL lack this estimation. Oracle and SQL-Server make fairly accurate estimations, yet their algorithms are secret. This paper thus proposes an algorithm for inequality join selectivity estimation. The proposed algorithm has been implemented in PostgreSQL and sent as a patch to be included in the next releases.
翻译:选择性估算是指 SQL 查询优化器估算查询中上游结果大小的能力。 这是主要计算方法, 优化者可以据此选择最廉价的计划执行。 虽然自70年代中期以来问题已经众所周知, 我们感到惊讶的是, 文献中没有关于不平等的选择性估算的解决方案。 通过测试四个共同数据库系统: Oracle、 SQL- Server、 PostgreSQL 和 MySQL, 我们发现开放源系统 PostgreSQL 和 MySQL 缺乏这一估算。 Oracle 和 SQL 服务器做了相当准确的估计, 但他们的算法是保密的。 因此, 本文提出了不平等的算法, 加入选择性估算。 提议的算法已在 PostgreSQL 中实施, 并发送作为补丁, 将包括在下一期发布中 。