Learned indices have been proposed to replace classic index structures like B-Tree with machine learning (ML) models. They require to replace both the indices and query processing algorithms currently deployed by the databases, and such a radical departure is likely to encounter challenges and obstacles. In contrast, we propose a fundamentally different way of using ML techniques to improve on the query performance of the classic R-Tree without the need of changing its structure or query processing algorithms. Specifically, we develop reinforcement learning (RL) based models to decide how to choose a subtree for insertion and how to split a node, instead of relying on hand-crafted heuristic rules as R-Tree and its variants. Experiments on real and synthetic datasets with up to 100 million spatial objects clearly show that our RL based index outperforms R-Tree and its variants.
翻译:为了用机器学习(ML)模型取代像B-Tree这样的典型指数结构,建议了学习指数,以取代B-Tree这样的典型指数结构。它们需要同时取代目前由数据库使用的指数和查询处理算法,而这种彻底的偏离可能会遇到挑战和障碍。相反,我们提出了一种根本不同的方法,即使用ML技术来改进经典R-Tree的查询性能,而不必改变其结构或查询处理算法。具体地说,我们开发了基于强化学习的模型,以决定如何选择插入的子树和如何分割节点,而不是像R-Tree及其变体那样依赖手制的超光速规则。用多达1亿个空间物体进行的实际和合成数据集实验清楚地表明,我们的RL基于索引的索引比R-Tree及其变体更优。