Learned indices have been proposed to replace classic index structures such as the B-Tree with machine learning (ML) models. They require replacing both the indices and the query processing algorithms currently deployed by databases, and such a radical departure is likely to encounter challenges and obstacles. In contrast, we propose a fundamentally different way of using ML techniques to improve the query performance of the classic R-Tree without changing its structure or query processing algorithms. Specifically, we develop reinforcement learning (RL) based models that decide how to choose a subtree for insertion and how to split a node when building an R-Tree, instead of relying on the hand-crafted heuristic rules currently used by the R-Tree and its variants. Experiments on real and synthetic datasets, the largest containing more than 100 million spatial objects, clearly show that our RL-based index outperforms the R-Tree and its variants in terms of query processing time.
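To make the two decision points concrete, the following is a minimal sketch of the subtree-choice step, assuming 2-D rectangles and the least-area-enlargement heuristic of the classic R-Tree. The names `Rect`, `ChooseSubtreePolicy`, `least_enlargement_policy`, and `choose_subtree` are hypothetical and introduced only for illustration; the sketch does not reproduce the RL models described here. It only shows that the heuristic is a pluggable function, which is where a learned policy could be substituted while the tree structure and query algorithms stay unchanged.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Rect:
    """Axis-aligned bounding rectangle (hypothetical helper type)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def area(self) -> float:
        return (self.xmax - self.xmin) * (self.ymax - self.ymin)

    def union(self, other: "Rect") -> "Rect":
        # Smallest rectangle that covers both self and other.
        return Rect(min(self.xmin, other.xmin), min(self.ymin, other.ymin),
                    max(self.xmax, other.xmax), max(self.ymax, other.ymax))

    def enlargement(self, other: "Rect") -> float:
        # Extra area needed for self to also cover other.
        return self.union(other).area() - self.area()


# A policy maps (child rectangles, new object) to the index of the chosen child.
# Assumption: a learned model would be wrapped to match this signature.
ChooseSubtreePolicy = Callable[[List[Rect], Rect], int]


def least_enlargement_policy(children: List[Rect], obj: Rect) -> int:
    """Classic heuristic: least area enlargement, ties broken by smaller area."""
    return min(range(len(children)),
               key=lambda i: (children[i].enlargement(obj), children[i].area()))


def choose_subtree(children: List[Rect], obj: Rect,
                   policy: ChooseSubtreePolicy = least_enlargement_policy) -> int:
    """Delegate the insertion decision to whichever policy is plugged in."""
    return policy(children, obj)


children = [Rect(0, 0, 2, 2), Rect(5, 5, 9, 9)]
new_object = Rect(3, 3, 4, 4)

# Heuristic choice: child 0 needs 12 units of extra area, child 1 needs 20.
heuristic_choice = choose_subtree(children, new_object)

# Stand-in for a learned policy (illustrative only, not a trained model).
stand_in_choice = choose_subtree(children, new_object, policy=lambda c, o: 1)
```

Because only the decision function changes, range and nearest-neighbour queries over the resulting tree would run exactly as they do on a conventionally built R-Tree. The node-split decision could be exposed through a similar policy hook.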