Although spatial index structures shorten query response time, they rely on complex tree structures to narrow down the search space. Such structures in turn incur additional storage overhead and take a toll on index maintenance. Recently, there has been a flurry of works attempting to leverage machine learning (ML) models to simplify index structures. Some follow-up works extend the idea to support geospatial point data. These approaches partition the multidimensional space into cells and assign IDs to these cells using space-filling curves (e.g., the Z-order curve) or mathematical equations. They work well for geospatial points but cannot handle complex geometries such as polygons and trajectories, which are widely available in geospatial data. This paper introduces GLIN, a lightweight learned index for spatial range queries on complex geometries. To achieve that, GLIN transforms geometries into Z-address intervals and builds a hierarchical model to learn the cumulative distribution function between these intervals and the record positions. The lightweight hierarchical model greatly shortens the index probing time. Furthermore, GLIN augments spatial query windows using an add-on function to guarantee query accuracy for both the Contains and Intersects spatial relationships. Our experiments on real-world and synthetic datasets show that GLIN incurs 40-70 times less storage overhead than popular spatial indexes such as Quad-Tree while still showing similar query response time for medium-selectivity queries. Moreover, GLIN's maintenance speed is around 1.5 times higher on insertion and 3-5 times higher on deletion.
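To make the pipeline described above concrete, the following is a minimal sketch and not GLIN's actual implementation. It assumes geometries are already reduced to integer-grid bounding boxes, replaces the hierarchical model with a single least-squares line, and covers only the Contains relationship; the window augmentation needed for Intersects is not shown. All function names (`interleave`, `z_interval`, `fit_linear`, `lower_bound`, `build`, `query_contains`) are hypothetical.

```python
from bisect import bisect_left

def interleave(x, y, bits=16):
    """Z-address (Morton code): bit i of x -> bit 2i, bit i of y -> bit 2i+1."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z

def z_interval(mbr):
    """Map a box (xmin, ymin, xmax, ymax) to its Z-address interval.
    Interleaving is monotone in each coordinate, so the two corners
    give the smallest and largest Z-address inside the box."""
    xmin, ymin, xmax, ymax = mbr
    return interleave(xmin, ymin), interleave(xmax, ymax)

def fit_linear(keys):
    """Assumed stand-in for the learned CDF: a least-squares line key -> position."""
    n = len(keys)
    if n == 0:
        return 0.0, 0.0
    mean_k, mean_p = sum(keys) / n, (n - 1) / 2
    var = sum((k - mean_k) ** 2 for k in keys)
    if var == 0:
        return 0.0, mean_p
    cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(keys))
    slope = cov / var
    return slope, mean_p - slope * mean_k

def lower_bound(keys, key, model):
    """First index i with keys[i] >= key, found by exponential search
    outward from the model's predicted position."""
    n = len(keys)
    if n == 0:
        return 0
    slope, intercept = model
    guess = min(max(int(slope * key + intercept), 0), n - 1)
    step = 1
    if keys[guess] >= key:
        lo = guess
        while lo > 0 and keys[lo] >= key:
            lo = max(guess - step, 0)
            step *= 2
        return bisect_left(keys, key, lo, guess)
    hi = guess
    while hi < n and keys[hi] < key:
        hi = min(guess + step, n)
        step *= 2
    return bisect_left(keys, key, guess + 1, hi)

def build(records):
    """Sort records by the lower end of their Z-interval and fit the model."""
    entries = sorted((z_interval(mbr)[0], rid, mbr) for rid, mbr in records.items())
    keys = [e[0] for e in entries]
    return entries, keys, fit_linear(keys)

def query_contains(index, window):
    """Ids of records whose box lies inside the window (Contains only)."""
    entries, keys, model = index
    z_lo, z_hi = z_interval(window)
    i = lower_bound(keys, z_lo, model)      # probe via the model
    hits = []
    while i < len(keys) and keys[i] <= z_hi:  # scan the candidate range
        _, rid, (xmin, ymin, xmax, ymax) = entries[i]
        if window[0] <= xmin and window[1] <= ymin and xmax <= window[2] and ymax <= window[3]:
            hits.append(rid)                # refinement on the exact box
        i += 1
    return hits

records = {"a": (1, 1, 2, 2), "b": (5, 5, 6, 7), "c": (2, 3, 3, 4), "d": (0, 6, 1, 7)}
index = build(records)
print(query_contains(index, (1, 1, 4, 4)))  # ['a', 'c']
```

The scan is sufficient for Contains because any box inside the window has a lower Z-address that falls within the window's own interval. A geometry that merely intersects the window can start before that interval, which is why the abstract's query-window augmentation is needed for Intersects.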