Spatial join processing techniques that identify intersections between complex geometries (e.g.,polygons) commonly follow a two-step filter-and-refine pipeline; the filter step evaluates the query predicate on the minimum bounding rectangles (MBRs) of objects and the refinement step eliminates false positives by applying the query on the exact geometries. We propose a raster intervals approximation of object geometries and introduce a powerful intermediate step in pipeline. In a preprocessing phase, our method (i) rasterizes each object geometry using a fine grid, (ii) models groups of nearby cells that intersect the polygon as an interval, and (iii) encodes each interval by a bitstring that captures the overlap of each cell in it with the polygon. Going one step further, we improve our approach to approximate each object by two sets of intervals that succintly capture the raster cells which (i) intersect with the object and (ii) are fully contained in the object. Using this representation, we show that we can verify whether two polygons intersect by a sequence of joins between the interval sets that take linear time. Our approximations can effectively be compressed and can be customized for use on partitioned data and polygons of varying sizes, rasterized at different granularities. Finally, we propose a novel algorithm that computes the interval approximation of a polygon without fully rasterizing it first, rendering the computation of approximations orders of magnitude faster. Experiments on real data demonstrate the effectiveness and efficiency of our proposal over previous work.
翻译:暂无翻译