In recent years, machine learning-based cardinality estimation methods have been replacing traditional methods. This shift is expected to benefit one of the most important applications of cardinality estimation, the query optimizer, and thereby speed up query processing. However, none of the existing methods estimates cardinalities precisely when a relational schema consists of many tables with strong correlations between tables or attributes. This paper shows that multiple density estimators can be combined to effectively estimate cardinalities for data with large, complex schemas and strong correlations. We propose Scardina, a new join cardinality estimation method that uses multiple partitioned models based on the schema structure.